Linux Testing Environment
Update 19 Nov 2020:
- This guide has been updated for use with CentOS 8 and the newly released InfluxDB 2.x
Setting up a Linux testing environment using CentOS 8, InfluxDB 2.0.1 and the Anaconda Distribution in VirtualBox.
- 1 Things to know
- 2 Step 1: Operating System VM
- 3 Step 2: Database
- 3.1 Installing InfluxDB 2.0 on CentOS 8
- 3.2 Configure InfluxDB firewall on CentOS 8
- 3.3 Access InfluxDB via browser
- 3.4 Creating a database in InfluxDB 1.6
- 3.5 Insert some test data in InfluxDB 1.6
- 3.6 Configuring an admin user for InfluxDB
- 3.7 Check disk space used by a specific database
- 3.8 Open port in CentOS firewall
- 4 Step 3: Python
Things to know
Anaconda
From the conda docs glossary:
A downloadable, free, open source, high-performance and optimized Python and R distribution. Anaconda includes conda, conda build, Python and 100+ automatically installed, open source scientific packages and their dependencies that have been tested to work well together, including SciPy, NumPy and many others. Use the conda install command to easily install 1,000+ popular open source packages for data science (including advanced and scientific analytics) from the Anaconda repository. Use the conda command to install thousands more open source packages.
Because Anaconda is a Python distribution, it can make installing Python quick and easy even for new users.
Available for Windows, macOS and Linux, all versions of Anaconda are supported by the community.
More info here: Wikipedia / Official website
CentOS
From the official website:
The CentOS Project is a community-driven free software effort focused on delivering a robust open source ecosystem. For users, we offer a consistent manageable platform that suits a wide variety of deployments. For open source communities, we offer a solid, predictable base to build upon, along with extensive resources to build, test, release, and maintain their code.
More info here: Official website
conda
From the conda docs glossary:
The package and environment manager program bundled with Anaconda that installs and updates conda packages and their dependencies. Conda also lets you easily switch between conda environments on your local computer.
More info here: Official conda docs
InfluxDB
From the official docs:
InfluxDB is a time series database designed to handle high write and query loads. It is an integral component of the TICK stack. InfluxDB is meant to be used as a backing store for any use case involving large amounts of timestamped data, including DevOps monitoring, application metrics, IoT sensor data, and real-time analytics.
More info here: Official InfluxDB docs / Official website
VirtualBox
From the official manual:
VirtualBox is a cross-platform virtualization application. What does that mean? For one thing, it installs on your existing Intel or AMD-based computers, whether they are running Windows, Mac, Linux or Solaris operating systems. Secondly, it extends the capabilities of your existing computer so that it can run multiple operating systems (inside multiple virtual machines) at the same time. So, for example, you can run Windows and Linux on your Mac, run Windows Server 2008 on your Linux server, run Linux on your Windows PC, and so on, all alongside your existing applications. You can install and run as many virtual machines as you like — the only practical limits are disk space and memory. VirtualBox is deceptively simple yet also very powerful. It can run everywhere from small embedded systems or desktop class machines all the way up to datacenter deployments and even Cloud environments.
More info here: Official manual
Step 1: Operating System VM
Install Linux in VirtualBox
- Install VirtualBox from here
- Download a Linux distribution; in this example we use CentOS 8 from here (ISO file)
- Install CentOS 8 in VirtualBox following the instructions here or here.
By default, the minimal installation will be selected during the installation process. Change the settings to install “Server” (no GUI).
Note: if you want to install the 64-bit version and cannot find it in the list, either your CPU does not support virtualization or virtualization is not enabled in the BIOS of your mainboard. In the latter case, restart your PC, go to the BIOS and activate virtualization.
Enable SSH access to the virtual machine
- Close the VM if it is still running.
- In VirtualBox, select the VM, then go to Settings.
- Go to Network > Advanced > Port Forwarding
- Add a new rule and input in the fields:
Guest IP: leave empty
- Click OK
Enable access to the InfluxDB database running on the VM
- Add a new rule and input in the fields:
Guest IP: leave empty
- Click OK
- Start the VM
Test SSH access with FileZilla
In this example, we connect to the VM using SSH and FileZilla.
- Start FileZilla and go to File > Site Manager > New Site and enter a name for the connection, e.g. “centos_indev”
- With “centos_indev” selected, adjust the settings for the connection in the General tab:
Protocol: SFTP – SSH File Transfer Protocol
- Click Connect to establish connection.
Helpful: An A-Z Index of the Bash command line for Linux
For more info about the ports used by ssh and influxd, you can use:
sudo ss -nlput | grep sshd
sudo ss -nlput | grep influxd
Also helpful in this context is:
Step 2: Database
Installing InfluxDB 2.0 on CentOS 8
- On the virtual machine in the terminal, download and install InfluxDB: (source)
sudo yum localinstall influxdb-2.0.1.x86_64.rpm
- Set up InfluxDB through the influx CLI: (source)
- Start and enable InfluxDB service on CentOS 8 / VirtualBox: (source)
sudo systemctl enable --now influxdb
- Check status to confirm it is running:
systemctl status influxdb
Configure InfluxDB firewall on CentOS 8
- Open port on the firewall: (source)
sudo firewall-cmd --add-port=8086/tcp --permanent
sudo firewall-cmd --reload
- Restart influxdb service:
sudo systemctl restart influxdb
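Once the port is open and the service has restarted, it can be useful to confirm that something is actually accepting TCP connections on port 8086. The following is a minimal sketch using only the Python standard library; the helper name port_is_open and the default timeout are illustrative choices, not part of InfluxDB or firewalld. It only checks that a TCP connection can be established, not that InfluxDB itself is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False
```

For example, port_is_open("127.0.0.1", 8086) can be called on the VM itself, or on the host if the port is forwarded in VirtualBox as described in Step 1.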
Access InfluxDB via browser
- In the browser, type
Creating a database in InfluxDB 1.6
In the Terminal, we now connect to the CLI (command line interface) of InfluxDB. It will automatically connect to the local InfluxDB instance:
In the CLI, create a database with the name influx_indev_db:
CREATE DATABASE "influx_indev_db"
Now let’s list all available databases; the newly created database should show up:
Set the new database for all future requests:
From now on, commands will only be run against “influx_indev_db”.
Insert some test data in InfluxDB 1.6
In the CLI, for testing, we insert a single time-series data point:
INSERT air_temperature,host=instrument_a,region=europe value=24.5
Now query the data we just wrote:
SELECT "host", "region", "value" FROM "air_temperature"
Let’s try storing another type of data, with two fields in the same measurement:
INSERT temperature,machine=unit42,type=assembly external=25,internal=37
To return all fields and tags with a query, you can use the * operator:
SELECT * FROM "temperature"
Configuring an admin user for InfluxDB
In the CLI, create an admin user:
CREATE USER paul WITH PASSWORD 'timeseries4days' WITH ALL PRIVILEGES
Now we need to enable authentication. In Terminal, open config file:
sudo vim /etc/influxdb/influxdb.conf
In this file, enable authentication in the [http] section (to edit, press ‘i’).
Save and quit the file:
Stop and then re-start InfluxDB:
sudo systemctl stop influxdb
sudo systemctl start influxdb
Now we cannot do anything without logging in, so we have to log in as the admin user. In the CLI, log in by typing:
Then enter username and password. In CLI, list all databases:
Bonus: In the Terminal, authenticate with a simple query:
curl -G "http://localhost:8086/query?u=paul&p=timeseries4days" --data-urlencode "q=SHOW DATABASES"
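The same authenticated request can be made from Python without extra packages. This is a minimal sketch that mirrors the curl call above: it assumes the InfluxDB 1.x /query endpoint on localhost:8086 and the 1.x JSON response layout (a "results" list containing "series" with "values"). The function names are illustrative, and the credentials are the example ones from this guide.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(query, username, password, base="http://localhost:8086"):
    """Build the /query URL with u, p and q as URL-encoded parameters."""
    params = urlencode({"u": username, "p": password, "q": query})
    return f"{base}/query?{params}"


def database_names(payload):
    """Extract database names from a decoded SHOW DATABASES response."""
    names = []
    for result in payload.get("results", []):
        for series in result.get("series", []):
            # Each row is a one-element list holding the database name
            names.extend(row[0] for row in series.get("values", []))
    return names


def show_databases(username, password):
    """Send SHOW DATABASES to the local instance and return the names."""
    url = build_query_url("SHOW DATABASES", username, password)
    with urlopen(url) as response:
        return database_names(json.load(response))
```

With the user created above, show_databases("paul", "timeseries4days") should return a list that includes influx_indev_db, provided the instance is running and authentication is set up as described.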
Check disk space used by a specific database
In the Terminal, type:
sudo du -sh /var/lib/influxdb/data/db1
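If the size needs to be tracked from a script rather than read off the terminal, a rough Python counterpart to the du call might look like the sketch below. Note the assumptions: it sums apparent file sizes, whereas du reports allocated disk blocks, so the two numbers can differ somewhat; the function names are illustrative; and reading /var/lib/influxdb usually requires elevated permissions.

```python
import os


def dir_size_bytes(path):
    """Sum the apparent sizes of all regular files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):  # skip symlinks
                total += os.path.getsize(file_path)
    return total


def human_readable(num_bytes):
    """Format a byte count with B/K/M/G/T units, similar to du -h."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G", "T"):
        if size < 1024 or unit == "T":
            return f"{size:.1f}{unit}"
        size /= 1024
```

For the path used above, this would be called as human_readable(dir_size_bytes("/var/lib/influxdb/data/db1")).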
Open port in CentOS firewall
We want to connect to the database running on the CentOS VM from the host system. By default, InfluxDB listens on port 8086, but before we can access the database via this port, we have to unblock the port in the CentOS VM firewall settings. If we don’t do this, nothing will work from outside. Here is how I did it:
- In the VM GUI, go to Firewall Configuration
- Then select Zones > public > Ports
- Click Add to open a specific port and enter 8086 (this is tcp)
The port should now be open and reachable. To check this, go to the Terminal and check which ports are open in the VM firewall with this command:
This should show 8086/tcp, and the Influx database should now be reachable from outside via IP
For more info go here.
Step 3: Python
Installing conda
In the VM, go to the Terminal and download Miniconda:
Install Miniconda and accept adding to PATH when prompted:
- Close Terminal
- Re-open Terminal
Conda should now be accessible; test by checking the conda version:
Set up environment
Get a list of all conda environments:
conda info --envs
If you have just installed conda, you only have the default environment base. Instead of installing new packages for base, we will use environments. Create a new environment called indev:
conda create --name indev
Activate the environment indev:
source activate indev
List the environments again with the same command as before. The list now also includes the new environment indev, and the star next to indev shows that it is the currently active environment.
Now install the full Anaconda distribution in the environment indev (this will take some minutes):
conda install anaconda
Now check which packages are installed (a lot!):
Execute Python script
Let’s say the Python script is on your Windows computer in a specific folder, e.g. E:\stuff\work\programming\test\script.py To execute this script, type:
Installing influxdb package for Python
In the Terminal with conda installed, type:
pip install influxdb
Accessing the database from host
This section covers accessing the database (running on the CentOS VM) from outside with Python (running on a Windows 10 PC). This worked for me. Remember to open port 8086 in the CentOS VM firewall or nothing will work! This short script creates a new database with name
import influxdb
client = influxdb.InfluxDBClient(host='127.0.0.1', port=8086, username='your_username', password='your_insanely_secret_password')
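Since only the client line of the script is shown, here is a hedged sketch of what a small create, write and read round trip could look like with the influxdb package installed above. It is not the original script. It assumes the InfluxDB 1.x API that this package targets, and the helper names make_point, store_point and connect are illustrative. The influxdb import is kept inside connect so the other helpers can be tried without the package.

```python
def make_point(measurement, tags, fields):
    """Build one point in the JSON layout accepted by write_points."""
    return {"measurement": measurement, "tags": dict(tags), "fields": dict(fields)}


def store_point(client, database, point):
    """Create the database, write one point, then read the measurement back."""
    client.create_database(database)
    client.write_points([point], database=database)
    measurement = point["measurement"]
    return client.query(f'SELECT * FROM "{measurement}"', database=database)


def connect(host="127.0.0.1", port=8086,
            username="your_username", password="your_insanely_secret_password"):
    """Open a client connection; requires 'pip install influxdb'."""
    from influxdb import InfluxDBClient  # imported lazily on purpose
    return InfluxDBClient(host=host, port=port,
                          username=username, password=password)
```

A typical call would be store_point(connect(), "your_database_name", make_point("air_temperature", {"host": "instrument_a"}, {"value": 24.5})), where the database name is a placeholder for the one you want to create.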
The influxdb Python library
Getting Started / API / influxdb on GitHub / API: DataFrameClient object in influxdb