March 5, 2020

Orange, Presto, Pentaho, Luigi, Scriptella - ETL, Analytics, Business Intelligence

This page is a work in progress.



These are notes on installing and testing DBs, Big Data, ETL, analytics, and business intelligence software.

Environments, DBs

The list below shows where the DBs and software are installed:
  • Win10 – MySQL DB, Orange, PySpark
  • CentOS7 – Oracle 18c XE
  • Ubuntu18, headless – MariaDB, PostgreSQL, Hadoop, HBase, Hive, Spark/PySpark, Mahout, PrestoDB, Pentaho, Luigi, Scriptella
  • Ubuntu18 – DB2

Installation Reference

 

Play Time...

 

Orange - Analytics, data mining

https://orange.biolab.si/

If you are behind a firewall and/or proxy and conda install doesn't work, download the installer and install it manually:
https://download.biolab.si/download/files/

Otherwise, use the Anaconda install steps:
> conda config --add channels conda-forge
> conda install orange3


To run:
> activate <conda environment with Orange>
> orange-canvas


Orange looks very promising, with great features and a good GUI, but it is not mature enough yet. For example, it only supports PostgreSQL as a DB.

Spark - Analytics

https://spark.apache.org/

PrestoDB

https://prestodb.io/ - Presto is a distributed SQL query engine that connects to multiple DBs of different types, such as Hadoop, RDBMS, and NoSQL.
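
To make the "one query across several DBs" idea concrete, here is a minimal sketch in Python. It does not use Presto. It uses the standard-library sqlite3 module and SQLite's ATTACH DATABASE as a small-scale stand-in, so that a single SQL statement joins tables living in two separate databases. The table names, columns, and rows are made up for illustration. In Presto, the equivalent join would address tables across connectors as catalog.schema.table.

```python
import sqlite3


# Stand-in only: SQLite's ATTACH plays the role of a second data source.
# Table names, columns, and rows are hypothetical.
def federated_totals():
    conn = sqlite3.connect(":memory:")                    # first database ("main")
    conn.execute("ATTACH DATABASE ':memory:' AS sales")   # second, separate database

    conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE sales.orders (customer_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO main.customers VALUES (?, ?)",
                     [(1, "alice"), (2, "bob")])
    conn.executemany("INSERT INTO sales.orders VALUES (?, ?)",
                     [(1, 10.0), (1, 5.0), (2, 7.5)])

    # One SQL statement joins tables from both databases.
    rows = conn.execute(
        """
        SELECT c.name, SUM(o.amount) AS total
        FROM main.customers AS c
        JOIN sales.orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(federated_totals())
```

The point of the sketch is only the shape of the query: the join itself does not care that the two tables live in different stores.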

Pentaho - ETL, Analytics, Report

https://www.pentaho.com/ - Pentaho consists of multiple packages: ETL, Analytics, Business Intelligence.

Flowable 

https://flowable.com - Java based.  Seems pretty good.

Luigi - Python based ETL

https://github.com/spotify/luigi

Developed by Spotify.  Looks pretty promising.
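
Luigi structures a pipeline as tasks. Each task declares its upstream dependencies, an output target, and the work to do, and a task whose output already exists is treated as complete and skipped. The sketch below imitates that model with the Python standard library only. The classes Task, Extract, Transform and the build function are hypothetical stand-ins, not Luigi's API. A real pipeline would subclass luigi.Task instead.

```python
import os
import tempfile


class Task:
    """Hypothetical stand-in for a Luigi-style task (not the real luigi.Task)."""

    def __init__(self, workdir):
        self.workdir = workdir

    def requires(self):
        return []  # upstream tasks, none by default

    def output(self):
        raise NotImplementedError  # path of the file this task produces

    def run(self):
        raise NotImplementedError

    def complete(self):
        # A task counts as done when its output file already exists.
        return os.path.exists(self.output())


class Extract(Task):
    def output(self):
        return os.path.join(self.workdir, "raw.txt")

    def run(self):
        with open(self.output(), "w") as f:
            f.write("3\n1\n2\n")


class Transform(Task):
    def requires(self):
        return [Extract(self.workdir)]

    def output(self):
        return os.path.join(self.workdir, "sorted.txt")

    def run(self):
        with open(self.requires()[0].output()) as f:
            numbers = sorted(int(line) for line in f)
        with open(self.output(), "w") as f:
            f.write("\n".join(str(n) for n in numbers) + "\n")


def build(task, log):
    """Run dependencies first, then the task, skipping anything complete."""
    if task.complete():
        return
    for dep in task.requires():
        build(dep, log)
    task.run()
    log.append(type(task).__name__)


def demo():
    with tempfile.TemporaryDirectory() as workdir:
        first, second = [], []
        build(Transform(workdir), first)   # runs Extract, then Transform
        build(Transform(workdir), second)  # output exists, nothing runs
        with open(Transform(workdir).output()) as f:
            result = f.read().split()
    return first, second, result
```

Calling demo() builds the pipeline twice in a temporary directory. The second build does nothing because the output file is already there, which is what makes re-running a half-finished pipeline cheap.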

Python Based Tools

The tools above are all similar - Python code based ETL.

Singer

https://www.singer.io

A very different concept - it uses Python, but the pieces are chained together in the shell with pipes.  Feels like IFTTT for ETL, running in a shell on the local machine.
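
In Singer, a "tap" writes JSON messages (SCHEMA, RECORD, STATE), one per line, to stdout, and a "target" reads them from stdin, which is why the two can be joined with a shell pipe. The sketch below imitates that flow in plain Python, with an in-memory buffer standing in for the pipe. The stream name "users", its fields, and the functions run_tap and run_target are made up for illustration, and no Singer library is used.

```python
import io
import json


def run_tap(out):
    """Hypothetical tap: writes one JSON message per line, as a tap would to stdout."""
    messages = [
        {"type": "SCHEMA", "stream": "users", "key_properties": ["id"],
         "schema": {"properties": {"id": {"type": "integer"},
                                   "name": {"type": "string"}}}},
        {"type": "RECORD", "stream": "users", "record": {"id": 1, "name": "alice"}},
        {"type": "RECORD", "stream": "users", "record": {"id": 2, "name": "bob"}},
        {"type": "STATE", "value": {"users": 2}},
    ]
    for message in messages:
        out.write(json.dumps(message) + "\n")


def run_target(inp):
    """Hypothetical target: reads messages line by line, as a target would from stdin."""
    rows, state = [], None
    for line in inp:
        message = json.loads(line)
        if message["type"] == "RECORD":
            rows.append(message["record"])
        elif message["type"] == "STATE":
            state = message["value"]  # keep the last bookmark seen
        # SCHEMA messages are ignored in this sketch
    return rows, state


def demo():
    pipe = io.StringIO()  # stands in for the shell pipe
    run_tap(pipe)
    pipe.seek(0)
    return run_target(pipe)
```

Because the only contract between the two sides is the line-delimited JSON stream, any tap can in principle be paired with any target.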

Scriptella - XML based ETL in Java

Written in Java, https://scriptella.org
If you're familiar with Spring Framework, Spring Batch is another option - https://mkyong.com/tutorials/spring-batch-tutorial/


Worth Mentioning


DB Browser

Free, simple, and works.

Other Tools



Other Lists


