Install Pentaho Server
Installation of Pentaho Server components ..
This section will guide you through the installation of the Pentaho Server:
create installation directories
create Pentaho Repository databases
configure JDBC database connections
start Pentaho server - systemd
license manager
The Pentaho server is a web application that runs in an Apache Tomcat servlet container.
Create /opt/pentaho server directories.
cd
sudo mkdir -p /opt/pentaho/{server,software}
* server - the installed Pentaho Server (the server archive is extracted here)
* software - downloaded Pentaho packages, plugins and drivers
Create /opt/pentaho/software sub-directories.
cd
cd /opt/pentaho/software
sudo mkdir -p {server,shims,ee-plugins,db_drivers}
* server - server binaries
* shims - collections of Hadoop libraries required to communicate with a specific version of Hadoop
* ee-plugins - Pentaho EE plugins
* db_drivers - database drivers
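The jar command is a general-purpose archiving and compression tool, based on ZIP and the ZLIB compression format. Because a JAR file is a ZIP archive, jar can also extract .zip packages. The options used in the command below are:
v - Verbose: lists each file as it is processed
x - Extract files from the archive
f - Read the archive from the file named by the next argument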
Copy Pentaho server package.
cd
cd ~/Downloads/'Archive Build (Suggested Installation Method)'/
sudo cp * /opt/pentaho/software/server
Unjar pentaho-server-ee-10.2.0.0-222.zip to /opt/pentaho/server/
cd
cd /opt/pentaho/server
sudo jar -vxf /opt/pentaho/software/server/pentaho-server-ee-10.2.0.0-222.zip
Change the permission for all .sh files.
All .sh files will need executable permission.
cd
cd /opt/pentaho/server
sudo find . -iname "*.sh" -exec bash -c 'chmod +x "$0"' {} \;
find - searches a directory tree for files
. - from this folder. You can put a path instead
-iname - case insensitive name
"*.sh" - wildcard filename
-exec - utility to execute commands
bash - what tool you want to use (you can use sh instead)
-c flag means execute the following command as interpreted by this program.
chmod +x - command to change the file to executable
"$0" - inside bash -c, the first argument after the command string; here it is the pathname that replaces {}
{} - If the string {} appears anywhere in the utility name or the arguments it is replaced by the pathname of the current file.
\; - terminates the -exec command (the backslash stops the shell from interpreting the semicolon)
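The bash -c wrapper is not strictly needed, because find can call chmod directly. The following is a minimal sketch of an equivalent, assuming a find that supports -iname (GNU and BSD find do). The function name make_scripts_executable is hypothetical and not part of Pentaho.

```shell
# Make every *.sh file under a directory executable (name match is case-insensitive).
# make_scripts_executable is a hypothetical helper, not a Pentaho tool.
make_scripts_executable() {
  dir="${1:-.}"
  # "{} +" hands many file names to one chmod call instead of one call per file.
  find "$dir" -type f -iname '*.sh' -exec chmod +x {} +
}
```

In this workshop the files are owned by root, so the function would need to run from a root shell, for example as make_scripts_executable /opt/pentaho/server.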
Check that it matches the following directory structure:
/opt/pentaho/
  server/
    pentaho-server/
      pentaho-solutions/
        system
The server plugins are installed into the system folder.
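The directory check above can also be scripted. This is a small sketch, assuming the archive was extracted under /opt/pentaho/server as described; check_layout is a hypothetical helper name.

```shell
# Report whether the expected Pentaho Server directories exist under a base directory.
# check_layout is a hypothetical helper; the relative paths come from the structure above.
check_layout() {
  base="${1:-/opt/pentaho/server}"
  status=0
  for rel in pentaho-server \
             pentaho-server/pentaho-solutions \
             pentaho-server/pentaho-solutions/system; do
    if [ -d "$base/$rel" ]; then
      echo "ok       $base/$rel"
    else
      echo "missing  $base/$rel"
      status=1
    fi
  done
  # Non-zero exit status when at least one directory is missing.
  return "$status"
}
```

Calling check_layout with no argument checks /opt/pentaho/server; a different base directory can be passed as the first argument.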
The Pentaho Repository resides on the database that you installed during the Windows or Linux environment preparation step, and consists of the following components:
Jackrabbit - Contains the solution repository, examples, security data, and content data from reports that you use Pentaho software to create.
Quartz - Holds data that is related to scheduling reports and jobs.
Hibernate - Holds data that is related to audit logging.
Pentaho Operations Mart - Reports on system usage and performance.
For your production server, Pentaho recommends that you change the default passwords in the following SQL script files to make the databases more secure.
Examine postgresql database scripts.
cd
cd /opt/pentaho/server/pentaho-server/data
ls -l
cd postgresql
ls -l
For this workshop we're going to keep the default usernames and passwords.
create_jcr_postgresql.sql
create_quartz_postgresql.sql
create_repository_postgresql.sql
pentaho_logging_postgresql.sql
pentaho_mart_postgresql.sql
Examine the create_jcr_postgresql.sql script.
cd
cd /opt/pentaho/server/pentaho-server/data/postgresql
cat create_jcr_postgresql.sql
--
-- note: this script assumes pg_hba.conf is configured correctly
--
-- \connect postgres postgres
drop database if exists jackrabbit;
drop user if exists jcr_user;
CREATE USER jcr_user PASSWORD 'password';
CREATE DATABASE jackrabbit WITH OWNER = jcr_user ENCODING = 'UTF8' TABLESPACE =>
GRANT ALL PRIVILEGES ON DATABASE jackrabbit to jcr_user;
Exit.
CTRL x
Client authentication is controlled by pg_hba.conf and is stored in the database cluster's data directory. (HBA stands for host-based authentication.)
You need to ensure that the users defined in the scripts can authenticate and connect to the databases.
Edit pg_hba.conf.
cd
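sudo nano /etc/postgresql/15/main/pg_hba.conf
For local connections, replace the 'peer' authentication method with 'md5', so that local clients authenticate with a password rather than by operating-system user name.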
The MD5 (message-digest algorithm) hashing algorithm is a one-way cryptographic function that accepts a message of any length as input and returns as output a fixed-length digest value to be used for authenticating the original message.
Save.
CTRL + o
Enter
CTRL + x
Restart service.
sudo service postgresql restart
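The manual edit above can be previewed with sed before touching the real file. This is a sketch only: peer_to_md5 is a hypothetical helper, and it assumes the usual pg_hba.conf layout in which the authentication method is the last field on each "local" line.

```shell
# Print a copy of a pg_hba.conf file with "peer" changed to "md5" on local
# (Unix socket) lines. The file itself is left untouched.
# peer_to_md5 is a hypothetical helper, not a PostgreSQL tool.
peer_to_md5() {
  # Only uncommented lines that start with "local" and end with "peer" are changed.
  sed -E 's/^(local[[:space:]].*[[:space:]])peer[[:space:]]*$/\1md5/' "$1"
}
```

Reviewing the output first, and keeping a backup of the original file, reduces the risk of locking yourself out. The file is normally readable only by the postgres user and root, and PostgreSQL still has to be restarted after the real file is changed.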
You will find SQL scripts for each supported Repository database:
PostgreSQL
Oracle
MS SQL Server
MySQL
MariaDB
Check that the PostgreSQL service is running.
sudo systemctl status postgresql
Select postgresql scripts directory.
cd
cd /opt/pentaho/server/pentaho-server/data/postgresql
ls -l
-rw-r--r-- 1 root root 464 Aug 7 07:42 alter_script_postgresql_BISERVER-13674.sql
-rw-r--r-- 1 root root 363 Aug 7 07:40 create_jcr_postgresql.sql
-rw-r--r-- 1 root root 5647 Aug 7 07:40 create_quartz_postgresql.sql
-rw-r--r-- 1 root root 356 Aug 7 07:40 create_repository_postgresql.sql
-rw-r--r-- 1 root root 4016 Aug 7 07:43 pentaho_logging_postgresql.sql
-rw-r--r-- 1 root root 1220 Aug 7 07:43 pentaho_mart_drop_postgresql.sql
-rw-r--r-- 1 root root 19035 Aug 7 07:43 pentaho_mart_postgresql.sql
-rw-r--r-- 1 root root 286 Aug 7 07:43 pentaho_mart_upgrade_audit_postgresql.sql
-rw-r--r-- 1 root root 7533 Aug 7 07:43 pentaho_mart_upgrade_postgresql.sql
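The listing above can also be checked mechanically before any script is run. The sketch below reports which of the five scripts used in this section are missing or unreadable; missing_repo_scripts is a hypothetical helper name.

```shell
# Print the name of each repository script that is missing from a directory.
# missing_repo_scripts is a hypothetical helper; the file names are the five
# scripts executed later in this section.
missing_repo_scripts() {
  dir="${1:-/opt/pentaho/server/pentaho-server/data/postgresql}"
  missing=0
  for f in create_jcr_postgresql.sql create_quartz_postgresql.sql \
           create_repository_postgresql.sql pentaho_mart_postgresql.sql \
           pentaho_logging_postgresql.sql; do
    if [ ! -r "$dir/$f" ]; then
      echo "$f"
      missing=1
    fi
  done
  # Exit status 0 means all five scripts are present and readable.
  return "$missing"
}
```

No output and a zero exit status mean the directory is ready for the \i commands.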
Log in as 'pentaho' superuser.
sudo -su pentaho psql postgres
Passw0rd123
💡The default password for each user: password
💡You can switch database in PostgreSQL with the command: \c
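| Command | Database | Database user / password |
| --- | --- | --- |
| \i create_jcr_postgresql.sql | Jackrabbit | jcr_user/password |
| \i create_quartz_postgresql.sql | Quartz | pentaho_user/password |
| \i create_repository_postgresql.sql | Hibernate | hibuser/password |
| \i pentaho_mart_postgresql.sql | OpsMart | hibuser/password |
| \i pentaho_logging_postgresql.sql | Logging | hibuser/password |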
Ensure you're in the /opt/pentaho/server/pentaho-server/data/postgresql directory.
Refer to the table above for username / password.
You may need to \q and log back in.
Execute the scripts.
\i create_jcr_postgresql.sql
\i create_quartz_postgresql.sql
Enter password:
password
Quit:
\q
Log back in as 'pentaho' superuser.
sudo -su pentaho psql postgres
Welcome123
\i create_repository_postgresql.sql
\i pentaho_mart_postgresql.sql
Enter password:
password
Quit:
\q
Log back in as 'pentaho' superuser.
sudo -su pentaho psql postgres
Welcome123
\i pentaho_logging_postgresql.sql
Enter password:
password
Quit:
\q
Check the databases in pgAdmin.
The scripts do not create any tables in the Hibernate and Jackrabbit databases. The Pentaho Server creates them when it first starts.
Now that you have initialized your repository database, you will need to configure Quartz, Hibernate, Jackrabbit, and Pentaho Operations Mart for a PostgreSQL database.
PostgreSQL is configured by default; if you kept the default passwords and port, you will not need to set up Quartz, Hibernate, Jackrabbit or the Pentaho Operations Mart.
By default, the examples in this section are for a PostgreSQL database that runs on port 5432. The default password is also in these examples. If you have a different port or different password, make sure that you change the password and port number in these examples to match the ones in your configuration.
Event information, such as scheduled reports, is stored in the Quartz JobStore. During the installation process, you must indicate where the JobStore is located by modifying the quartz.properties file.
Navigate to quartz directory.
cd
cd /opt/pentaho/server/pentaho-server/pentaho-solutions/system/scheduler-plugin/quartz
sudo nano -c quartz.properties
Locate the #_replace_jobstore_properties section and check.
[line 300/451]
org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.PostgreSQLDelegate
Locate the # Configure Datasources section and check.
[line 379/451]
org.quartz.dataSource.myDS.jndiURL = Quartz
Exit.
CTRL + x
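The two quartz.properties settings checked above can also be verified without opening an editor. This is a minimal sketch; check_quartz_properties is a hypothetical helper, and it only accepts uncommented lines with exactly the values shown in this section.

```shell
# Succeed only if quartz.properties contains both PostgreSQL settings described above.
# check_quartz_properties is a hypothetical helper; it only reads the file.
check_quartz_properties() {
  file="${1:-quartz.properties}"
  # The uncommented delegate line must name the PostgreSQL delegate class.
  grep -Eq '^[[:space:]]*org\.quartz\.jobStore\.driverDelegateClass[[:space:]]*=[[:space:]]*org\.quartz\.impl\.jdbcjobstore\.PostgreSQLDelegate[[:space:]]*$' "$file" || return 1
  # The uncommented data source line must point at the Quartz JNDI name.
  grep -Eq '^[[:space:]]*org\.quartz\.dataSource\.myDS\.jndiURL[[:space:]]*=[[:space:]]*Quartz[[:space:]]*$' "$file" || return 1
}
```

From the quartz directory, running check_quartz_properties && echo "quartz.properties looks correct" prints the message only when both settings are in place.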
Modify the Hibernate settings file to specify where Pentaho should find the Pentaho Repository’s Hibernate configuration file. The Hibernate configuration file specifies driver and connection information, as well as dialects and how to handle connection closes and timeouts.
The Hibernate database is also where the Pentaho Server stores the audit logs that act as source data for the Pentaho Operations Mart.
Navigate to hibernate-settings directory.
cd
cd /opt/pentaho/server/pentaho-server/pentaho-solutions/system/hibernate
sudo nano -c hibernate-settings.xml
Locate the config-file section and check.
[line 36/53]
<config-file>system/hibernate/postgresql.hibernate.cfg.xml</config-file>
Exit.
CTRL + x
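To read the active Hibernate configuration file without opening the editor, the config-file element can be extracted with sed. This is a sketch under the assumption that the element sits on a single line, as in the excerpt above; hibernate_config_file is a hypothetical helper name.

```shell
# Print the value of each <config-file> element found in hibernate-settings.xml.
# hibernate_config_file is a hypothetical helper; it assumes the opening and
# closing tags are on the same line.
hibernate_config_file() {
  sed -n 's:.*<config-file>\(.*\)</config-file>.*:\1:p' "${1:-hibernate-settings.xml}"
}
```

For a PostgreSQL repository the expected value is system/hibernate/postgresql.hibernate.cfg.xml. The function prints every match, so a value inside an XML comment would also be shown.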
Display postgresql.hibernate.cfg.xml.
sudo nano -c postgresql.hibernate.cfg.xml
Check postgresql has been set as default.
[line 35/51]
<!-- Postgres 8 Configuration -->
<property name="connection.driver_class">org.postgresql.Driver</property>
<property name="dialect">org.hibernate.dialect.PostgreSQLDialect</property>
<property name="hibernate.connection.datasource">java:comp/env/jdbc/Hibernat>
<property name="connection.pool_size">10</property>
<property name="show_sql">false</property>
<property name="hibernate.jdbc.use_streams_for_binary">true</property>
<!-- replaces DefinitionVersionManager -->
<property name="hibernate.hbm2ddl.auto">update</property>
<!-- load resource from classpath -->
<mapping resource="hibernate/postgresql.hbm.xml" />
<!-- mapping resource above is from CE; below is from EE -->
<mapping resource="hibernate/postgresql.EE.hbm.xml" />
</session-factory>
</hibernate-configuration>
Exit.
CTRL + x
Apache Jackrabbit is an open-source content repository for the Java platform. A JCR (Java Content Repository) is a type of object database for storing, searching, and retrieving hierarchical data.
As shown in the table below, locate and verify or change the code so that the PostgreSQL lines are not commented out, but the MySQL, Oracle, and MS SQL Server lines are commented out.
If you have a different port or different password, make sure that you change the password and port number in these examples to match the ones in your configuration.
Navigate to jackrabbit directory.
cd
cd /opt/pentaho/server/pentaho-server/pentaho-solutions/system/jackrabbit
sudo nano -c repository.xml
| Line | Section | Setting |
| --- | --- | --- |
| 71/442 | Repository | Filesystem schema: postgresql |
| 129/442 | Datastore | databaseType: postgresql |
| 231/442 | Workspace | Filesystem schema: postgresql |
| 279/442 | Persistence Manager (1) | PersistenceManager schema: postgresql |
| 347/442 | Versioning | Filesystem schema: postgresql |
| 398/442 | Persistence Manager (2) | PersistenceManager schema: postgresql |
| 434/442 | Database Journal | Journal schema: postgresql |
Exit.
CTRL + x
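The seven settings checked above can be listed in one pass. The sketch below is an approximation: active_jackrabbit_backends is a hypothetical helper, it assumes the settings appear as param elements named schema or databaseType, and it treats every line from one containing <!-- through one containing --> as commented out.

```shell
# List uncommented "schema" and "databaseType" parameters in repository.xml,
# prefixed with their line numbers.
# active_jackrabbit_backends is a hypothetical helper; the comment tracking is
# line-based, so a parameter sharing a line with a comment marker is skipped.
active_jackrabbit_backends() {
  awk '
    /<!--/ { in_comment = 1 }
    !in_comment && /name="(schema|databaseType)"/ { print NR ": " $0 }
    /-->/  { in_comment = 0 }
  ' "${1:-repository.xml}"
}
```

For a PostgreSQL repository, every line printed should carry the value postgresql. Any other value suggests that a MySQL, Oracle, or MS SQL Server block is still active.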
After your Repository has been configured, you must configure the web application servers to connect to the Pentaho Repository. In this step, you will make JDBC and JNDI connections to the Hibernate, Jackrabbit, and Quartz components.
To connect to a database, including the Pentaho Repository database, you will need to download and install a JDBC driver to the appropriate places for Pentaho components as well as on the web application server that contains the Pentaho Server.
Due to licensing restrictions, Pentaho cannot redistribute some third-party database drivers. You must download and install these drivers yourself.
For this workshop we're going to distribute a MySQL driver ..
Copy the JDBC drivers to jdbc-distribution.
cd
cd ~/Downloads/'Database Drivers'/
sudo cp * /opt/pentaho/software/db_drivers
cd
cd /opt/pentaho/software/db_drivers
sudo cp mysql-connector-j-9.0.0.jar /opt/pentaho/server/jdbc-distribution
Distribute the drivers.
cd
cd /opt/pentaho/server/jdbc-distribution
sudo ./distribute-files.sh /opt/pentaho/server/pentaho-server/tomcat/lib
/opt/pentaho/server/jdbc-distribution
DEBUG: Using PENTAHO_JAVA_HOME
DEBUG: _PENTAHO_JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64
DEBUG: _PENTAHO_JAVA=/usr/lib/jvm/java-17-openjdk-amd64/bin/java
You must restart your Pentaho Server and Client tools to begin using the new drivers.
You will need to restart the Pentaho Server and Client Tools to register the driver.
Multiple distribution paths can be set, separated by a 'space'.
reboot
Location of JDBC drivers in Pentaho+:
| Component | JDBC driver directory |
| --- | --- |
| Pentaho Server | /server/pentaho-server/tomcat/lib |
| Pentaho Data Integration (Spoon) | /design-tools/data-integration/lib |
| Pentaho Report Designer (PRD) | /design-tools/report-designer/lib/jdbc |
| Pentaho Aggregation Designer (PAD) | /design-tools/aggregation-designer/drivers |
| Pentaho Schema Workbench (PSW) | /design-tools/schema-workbench/drivers |
| Pentaho Metadata Editor (PME) | /design-tools/metadata-editor/libext/JDBC |
Check that the driver(s) have been added to each location; the distribution tool sometimes fails.
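One way to perform this check is to test for the jar file in each target directory. This is a sketch; driver_missing_from is a hypothetical helper name, and the directories to check depend on which components are installed.

```shell
# Print each directory that does NOT contain the given driver jar.
# driver_missing_from is a hypothetical helper.
# Usage: driver_missing_from <jar file name> <directory> [<directory> ...]
driver_missing_from() {
  jar="$1"
  shift
  rc=0
  for d in "$@"; do
    if [ ! -f "$d/$jar" ]; then
      echo "$d"
      rc=1
    fi
  done
  # Exit status 0 means the jar was found in every directory.
  return "$rc"
}
```

For the driver used in this workshop, the call would be driver_missing_from mysql-connector-j-9.0.0.jar /opt/pentaho/server/pentaho-server/tomcat/lib. If a directory is printed, the jar can be copied there manually.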
Database connection and network information, such as the username, password, driver class information, IP address or domain name, and port numbers for your Pentaho Repository database are stored in the context.xml file.
Check context.xml.
cd
cd /opt/pentaho/server/pentaho-server/tomcat/webapps/pentaho/META-INF
sudo nano -c context.xml
In a Production environment, check the username, password, driver class information, IP address (or domain name), and port numbers to match the correct values for your environment.
Exit.
CTRL + x
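For a quick look at the host, port, and database of each connection, the JDBC URLs can be pulled out of context.xml. This sketch assumes each data source is defined with a url="..." attribute, as in a standard Tomcat JNDI resource definition; jdbc_urls is a hypothetical helper name.

```shell
# Print every url="..." attribute found in context.xml, one per line.
# jdbc_urls is a hypothetical helper; it assumes a grep that supports -o
# (GNU and BSD grep do).
jdbc_urls() {
  grep -o 'url="[^"]*"' "${1:-context.xml}"
}
```

With the defaults used in this section, the URLs are expected to point at a PostgreSQL database on port 5432.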
Now that you have completed the initial Pentaho Archive installation steps, you are ready to start the Pentaho Server.
Switch to pentaho-server directory.
cd
cd /opt/pentaho/server/pentaho-server
sudo ./start-pentaho.sh
DEBUG: Using PENTAHO_JAVA_HOME
DEBUG: _PENTAHO_JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64
DEBUG: _PENTAHO_JAVA=/usr/lib/jvm/java-17-openjdk-amd64/bin/java
DEBUG: PENTAHO_LICENSE_INFORMATION_PATH=
Using CATALINA_BASE: /opt/pentaho/server/pentaho-server/tomcat
Using CATALINA_HOME: /opt/pentaho/server/pentaho-server/tomcat
Using CATALINA_TMPDIR: /opt/pentaho/server/pentaho-server/tomcat/temp
Using JRE_HOME: /usr
Using CLASSPATH: /opt/pentaho/server/pentaho-server/tomcat/bin/bootstrap.jar:/opt/pentaho/server/pentaho-server/tomcat/bin/tomcat-juli.jar
Using CATALINA_OPTS: -Xms2048m -Xmx6144m -Djava.library.path=/opt/pentaho/server/pentaho-server/pentaho-solutions/native-lib/linux/x86_64/ -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 -Dfile.encoding=utf8 -Djava.locale.providers=COMPAT,SPI -DDI_HOME="/opt/pentaho/server/pentaho-server/pentaho-solutions/system/kettle"
Tomcat started.
Tail the log (new terminal).
cd
sudo tail -f /opt/pentaho/server/pentaho-server/tomcat/logs/catalina.2024-*.log
....
31-Jul-2024 17:20:03.810 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["http-nio-8080"]
31-Jul-2024 17:20:03.827 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in [61207] milliseconds
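Instead of watching the log by eye, a script can wait for the "Server startup in" line shown above. This is a minimal polling sketch; wait_for_startup is a hypothetical helper, and the 300-second default timeout and 5-second interval are arbitrary choices.

```shell
# Wait until a Tomcat log file contains the "Server startup in" line.
# Returns 0 when the line is found, 1 if the timeout (in seconds) expires first.
# wait_for_startup is a hypothetical helper.
wait_for_startup() {
  log="$1"
  timeout="${2:-300}"
  waited=0
  while [ "$waited" -lt "$timeout" ]; do
    # A log file that does not exist yet is treated as "not started".
    if grep -q 'Server startup in' "$log" 2>/dev/null; then
      return 0
    fi
    sleep 5
    waited=$((waited + 5))
  done
  return 1
}
```

The first argument is the full path of the current catalina log file. Because the log is owned by root in this workshop, the function would need to run from a root shell.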
The Pentaho User Console (PUC) is a web-based design environment where you can analyze data, create interactive reports, dashboard reports, and build integrated dashboards to share business intelligence solutions with others in your organization and on the internet.
Username: admin
Password: password
Systemd is a system and service manager for Linux operating systems. It's designed to be backwards compatible with SysV init scripts, and provides several features to start system services in parallel, which can reduce boot times.
To use this service:
Save the file below as: /etc/systemd/system/pentaho-server.service
[Unit]
Description=Pentaho Server
Before=multi-user.target
Before=graphical.target
After=network.service
After=network.target
After=syslog.target
[Service]
Type=forking
Environment="JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64"
ExecStart=/opt/pentaho/server/pentaho-server/start-pentaho.sh
ExecStartPost=/bin/echo pentaho...end of unitfile
ExecStop=/opt/pentaho/server/pentaho-server/stop-pentaho.sh
TimeoutSec=500
IgnoreSIGPIPE=no
KillMode=process
GuessMainPID=no
RemainAfterExit=yes
SuccessExitStatus=5 6
User=root
[Install]
WantedBy=multi-user.target
Reload the systemd daemon.
sudo systemctl daemon-reload
Start the service.
sudo systemctl start pentaho-server
Enable the service to start on boot.
sudo systemctl enable pentaho-server
Created symlink /etc/systemd/system/multi-user.target.wants/pentaho-server.service → /etc/systemd/system/pentaho-server.service.
You can then manage the service using standard systemd commands:
To stop:
sudo systemctl stop pentaho-server
To restart:
sudo systemctl restart pentaho-server
To check status:
sudo systemctl status pentaho-server
The Pentaho licensing model changed in Pentaho Pro Suite 10 and later: licenses are now handled via a License Manager.
This embedded service (on-prem / cloud) will enable our customers (Direct & OEM) to manage their PDI & BA entitlements with greater visibility and ease.
The License Manager also checks EE plugins.
You can obtain a Pentaho trial license to test the product before you acquire it. You need internet access to activate a trial license and run the installed trial version. It is not possible to run it for an extended period of time while disconnected from the internet.
The temporary license expires thirty days after the start of the evaluation period.
To request a trial activation ID, go to the Pentaho download page on the Hitachi Vantara website. Alternatively, contact the Pentaho Sales team.
On the download page, click Start a Free 30 Day Trial to open the trial registration form.
Complete the trial registration form and click Submit. The trial entitlement and activation ID is sent to you by email.
Download and install the Pentaho product. When launching the product, the Add License window opens.
Select Activation Code.
Copy the activation ID that you received from the local license manager, paste it into the provided field, and click OK
If you are an existing customer wanting to upgrade from Pentaho 9.x or earlier supported versions, do not start the server before upgrading the licenses. You must install the new version of the product before activating the licenses.
Activate a license using a cloud license server
If you are able to access our cloud license server without any security restrictions, this is the quickest way to get up and running with Pentaho.
Copy the cloud license server URL that Hitachi Vantara emails you into the License Server field of the Add License dialog box that opens when you launch the product and click OK.
To ensure that the Pentaho Server uses the same location to store and retrieve your Pentaho licenses, you must create a PENTAHO_LICENSE_INFORMATION_PATH system environment variable for your Pentaho user account if it does not exist. It does not matter what location you choose; however, the location needs to be available to the user account(s) that run the Pentaho Server.
Perform the following steps to set the environment variable for the license path in Linux.
Edit the /etc/environment file.
cd
cd /etc
sudo nano environment
Add this line in a convenient place (changing the path if necessary)
#
export PENTAHO_LICENSE_INFORMATION_PATH=/home/pentaho/.pentaho/.elmLicInfo.plt
The license information file is saved in the /home/pentaho/.pentaho folder.
Log out and log back into the operating system for the change to take effect.
Verify that the variable is properly set using the following command.
env | grep PENTAHO_LICENSE_INFORMATION_PATH
The PENTAHO_LICENSE_INFORMATION_PATH variable is now set.
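A slightly stricter check also confirms that the directory for the license file exists. This is a sketch; check_license_path is a hypothetical helper name, and it does not check whether the Pentaho user can write to the directory.

```shell
# Verify that PENTAHO_LICENSE_INFORMATION_PATH is set and that its parent
# directory exists. check_license_path is a hypothetical helper.
check_license_path() {
  if [ -z "${PENTAHO_LICENSE_INFORMATION_PATH:-}" ]; then
    echo "PENTAHO_LICENSE_INFORMATION_PATH is not set"
    return 1
  fi
  parent=$(dirname "$PENTAHO_LICENSE_INFORMATION_PATH")
  if [ ! -d "$parent" ]; then
    echo "directory does not exist: $parent"
    return 1
  fi
  echo "license path: $PENTAHO_LICENSE_INFORMATION_PATH"
}
```

Run it as the user account that starts the Pentaho Server, since the variable has to be visible to that account.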