Linux Testing Environment
Update 19 Nov 2020:
- This guide has been updated for use with CentOS 8 and the newly released InfluxDB 2.x
Setting up a Linux testing environment using CentOS 8, InfluxDB 2.0.1 and the Anaconda Distribution in VirtualBox.
Contents
- 1 Things to know
- 2 Step 1: Operating System VM
- 3 Step 2: Database
- 3.1 Installing InfluxDB 2.0 on CentOS 8
- 3.2 Configure InfluxDB firewall on CentOS 8
- 3.3 Access InfluxDB via browser
- 3.4 Creating a database in InfluxDB 1.6
- 3.5 Insert some test data in InfluxDB 1.6
- 3.6 Configuring an admin user for InfluxDB
- 3.7 Check disk space used by a specific database
- 3.8 Open port in CentOS firewall
- 4 Step 3: Python
Things to know
Anaconda
From conda docs glossary:
A downloadable, free, open source, high-performance and optimized Python and R distribution. Anaconda includes conda, conda build, Python and 100+ automatically installed, open source scientific packages and their dependencies that have been tested to work well together, including SciPy, NumPy and many others. Use the conda install command to easily install 1,000+ popular open source packages for data science, including advanced and scientific analytics, from the Anaconda repository. Use the conda command to install thousands more open source packages. Because Anaconda is a Python distribution, it can make installing Python quick and easy even for new users.
Available for Windows, macOS and Linux, all versions of Anaconda are supported by the community.
More info here: Wikipedia / Official website
CentOS
From the official website:
The CentOS Project is a community-driven free software effort focused on delivering a robust open source ecosystem. For users, we offer a consistent manageable platform that suits a wide variety of deployments. For open source communities, we offer a solid, predictable base to build upon, along with extensive resources to build, test, release, and maintain their code.
More info here: Official website
Conda
From conda docs glossary:
The package and environment manager program bundled with Anaconda that installs and updates conda packages and their dependencies. Conda also lets you easily switch between conda environments on your local computer.
More info here: Official conda docs
InfluxDB
From the official docs:
InfluxDB is a time series database designed to handle high write and query loads. It is an integral component of the TICK stack. InfluxDB is meant to be used as a backing store for any use case involving large amounts of timestamped data, including DevOps monitoring, application metrics, IoT sensor data, and real-time analytics.
More info here: Official InfluxDB docs / Official website
VirtualBox
From the official manual:
VirtualBox is a cross-platform virtualization application. What does that mean? For one thing, it installs on your existing Intel or AMD-based computers, whether they are running Windows, Mac, Linux or Solaris operating systems. Secondly, it extends the capabilities of your existing computer so that it can run multiple operating systems (inside multiple virtual machines) at the same time. So, for example, you can run Windows and Linux on your Mac, run Windows Server 2008 on your Linux server, run Linux on your Windows PC, and so on, all alongside your existing applications. You can install and run as many virtual machines as you like — the only practical limits are disk space and memory. VirtualBox is deceptively simple yet also very powerful. It can run everywhere from small embedded systems or desktop class machines all the way up to datacenter deployments and even Cloud environments.
More info here: Official manual
Step 1: Operating System VM
Install Linux in VirtualBox
- Install VirtualBox from here
- Download a Linux distribution; in this example we use CentOS 8 from here (ISO file)
- Install CentOS 8 in VirtualBox following the instructions here or here.
By default, the minimal installation will be selected during the installation process. Change the settings to install “Server” (no GUI).
Note: if you want to install the 64bit version and you cannot find it in the list, this means that virtualization is not possible with your CPU or that virtualization is not enabled in the BIOS of your mainboard. In the latter case, restart your PC, go to BIOS and activate virtualization.
Enable SSH access to the virtual machine
- Close the VM if it is still running.
- In VirtualBox, select the VM, then go to Settings.
- Go to Network > Advanced > Port Forwarding
- Add a new rule and input in the fields:
Host IP: 127.0.0.1
Host Port: 2222
Guest IP: leave empty
Guest Port: 22
- Click OK
Enable access to the InfluxDB database running on the VM
- Add a new rule and input in the fields:
Host IP: 127.0.0.1
Host Port: 8086
Guest IP: leave empty
Guest Port: 8086
- Click OK
- Start the VM
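Once the VM is running, the two forwarding rules above can be checked from the host with a short script. This is only a minimal sketch: the function names are made up for this example, and a successful connection only shows that something on the host accepts connections on that port, not that the service inside the VM is answering. Port 8086 will only be meaningful once InfluxDB is installed and running (Step 2).

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_forwarded_ports(ports, host="127.0.0.1"):
    """Map each port number to whether it currently accepts connections."""
    return {port: port_is_open(host, port) for port in ports}


# Example call from the host (ports taken from the forwarding rules above):
# print(check_forwarded_ports([2222, 8086]))
```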
Test SSH access with FileZilla
In this example, we connect to the VM using SSH and FileZilla.
- Start FileZilla and go to File > Site Manager > New Site and enter a name for the connection, e.g. “centos_indev”
- With “centos_indev” selected, adjust the settings for the connection in the General tab:
Protocol: SFTP – SSH File Transfer Protocol
Host: 127.0.0.1
Port: 2222
Logon Type: Normal
User: your_username
Password: your_password
- Click Connect to establish connection.
Helpful: An A-Z Index of the Bash command line for Linux
For more info about the ports used by ssh and influxd, you can use:
sudo ss -nlput | grep sshd
sudo ss -nlput | grep influxd
Also helpful in this context is:
netstat -tln
Step 2: Database
Installing InfluxDB 2.0 on CentOS 8
- On the virtual machine in the terminal, download and install InfluxDB: (source)
wget https://dl.influxdata.com/influxdb/releases/influxdb-2.0.1.x86_64.rpm
sudo yum localinstall influxdb-2.0.1.x86_64.rpm
- Start and enable InfluxDB service on CentOS 8 / VirtualBox: (source)
sudo systemctl enable --now influxdb
- Check status to confirm it is running:
systemctl status influxdb
- Set up InfluxDB through the influx CLI; this talks to the running service, so it comes after the service has started: (source)
influx setup
Configure InfluxDB firewall on CentOS 8
- Open port on the firewall: (source)
sudo firewall-cmd --add-port=8086/tcp --permanent
sudo firewall-cmd --reload
- Restart influxdb service:
sudo systemctl restart influxdb
Access InfluxDB via browser
- In the browser, type
http://localhost:8086
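The same reachability check can be scripted instead of done in the browser. The following is a minimal sketch, assuming the instance exposes a /health endpoint that returns a JSON document with a "status" of "pass" when it is ready (this is my understanding of InfluxDB 2.x; verify against your version). The function names are hypothetical.

```python
import json
import urllib.request


def health_url(base_url="http://localhost:8086"):
    """Build the URL of the health endpoint for a given base URL."""
    return base_url.rstrip("/") + "/health"


def influxdb_health(base_url="http://localhost:8086", timeout=5.0):
    """Fetch the health endpoint and return the decoded JSON document."""
    with urllib.request.urlopen(health_url(base_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def is_healthy(health):
    """Interpret the JSON document; assumes 'pass' marks a ready instance."""
    return health.get("status") == "pass"


# Example call on the host, using the forwarded port:
# print(is_healthy(influxdb_health("http://localhost:8086")))
```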
Creating a database in InfluxDB 1.6
In the Terminal, we now connect to the CLI (command line interface) of InfluxDB. It will automatically connect to the local InfluxDB instance:
influx
In the CLI, create a database with the name influx_indev_db:
CREATE DATABASE "influx_indev_db"
Now let’s list all available databases, the newly created database will now show up:
SHOW DATABASES
Set the new database for all future requests:
USE "influx_indev_db"
Now future commands will only be run against the “influx_indev_db”.
Insert some test data in InfluxDB 1.6
In the CLI, for testing, we insert a single time-series data point:
INSERT air_temperature,host=instrument_a,region=europe value=24.5
Now query the data we just wrote:
SELECT "host", "region", "value" FROM "air_temperature"
Let’s try storing another type of data, with two fields in the same measurement:
INSERT temperature,machine=unit42,type=assembly external=25,internal=37
To return all fields and tags with a query, you can use the * operator:
SELECT * FROM "temperature"
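The INSERT statements above use InfluxDB's line protocol: a measurement name, comma-separated tags, a space, then comma-separated fields. As a rough illustration of how such a line is put together, here is a small sketch; the helper names are made up for this example and only basic escaping is handled. Integer values get the line protocol's i suffix here, so floats are passed to obtain float fields like the ones written above.

```python
def _escape(text, chars):
    """Backslash-escape each of the given special characters."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value):
    """Render a field value in line-protocol syntax."""
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    # Anything else is written as a double-quoted string field.
    return '"' + str(value).replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_line_protocol(measurement, tags, fields):
    """Build one line: measurement,tag=value,... field=value,..."""
    head = _escape(measurement, ", ")
    for key in sorted(tags):  # tags are emitted sorted by key
        head += "," + _escape(key, ",= ") + "=" + _escape(str(tags[key]), ",= ")
    body = ",".join(
        _escape(key, ",= ") + "=" + _format_field(value)
        for key, value in fields.items()
    )
    return head + " " + body


print(to_line_protocol(
    "air_temperature",
    {"host": "instrument_a", "region": "europe"},
    {"value": 24.5},
))
```

The printed line is the part that follows INSERT in the first statement above.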
Configuring an admin user for InfluxDB
In CLI, create admin user:
CREATE USER paul WITH PASSWORD 'timeseries4days' WITH ALL PRIVILEGES
Now we need to enable authentication. In Terminal, open config file:
sudo vim /etc/influxdb/influxdb.conf
In this file, enable authentication by setting the auth-enabled option to true in the [http] section (to edit, press ‘i’).
Save and quit the file:
:wq
Stop and then re-start InfluxDB:
sudo systemctl stop influxdb
sudo systemctl start influxdb
influxd
Start CLI:
influx
Now we cannot do anything without logging in, so we have to log in as the admin user. In the CLI, log in by typing:
auth
Then enter username and password.
In CLI, list all databases:
SHOW DATABASES
Bonus: In the Terminal, authenticate with a simple query:
curl -G "http://localhost:8086/query?u=paul&p=timeseries4days" --data-urlencode "q=SHOW DATABASES"
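The curl call above can be reproduced from Python with the standard library only. This is a sketch, not a replacement for a proper client: the function names are hypothetical, and the credentials are passed in the URL exactly as in the curl example, which is only reasonable in a local test setup like this one.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, username, password, query):
    """Build the /query URL with u, p and q as URL-encoded parameters."""
    params = urllib.parse.urlencode({"u": username, "p": password, "q": query})
    return base_url.rstrip("/") + "/query?" + params


def run_query(base_url, username, password, query, timeout=5.0):
    """Send the query with GET and return the decoded JSON response."""
    url = build_query_url(base_url, username, password, query)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call, mirroring the curl command above:
# print(run_query("http://localhost:8086", "paul", "timeseries4days", "SHOW DATABASES"))
```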
Check disk space used by a specific database
In the Terminal, type (db1 stands for the name of your database):
sudo du -sh /var/lib/influxdb/data/db1
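If you prefer to get the number from a script, a directory's size can be approximated by walking it and summing file sizes. A minimal sketch with made-up function names: it sums apparent file sizes, so the result can differ somewhat from what du reports, and it needs read access to the data directory (hence sudo in the command above).

```python
import os


def directory_size_bytes(path):
    """Sum the sizes of all regular files below path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def human_readable(num_bytes):
    """Format a byte count with a binary unit suffix, e.g. 2048 as 2.0K."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G", "T"):
        if size < 1024 or unit == "T":
            return f"{size:.1f}{unit}"
        size /= 1024


# Example call inside the VM (path taken from the command above):
# print(human_readable(directory_size_bytes("/var/lib/influxdb/data/db1")))
```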
Open port in CentOS firewall
We want to connect to the database running on the CentOS VM from the host system. InfluxDB by default is listening on port 8086, but before we can access the database via this port, we have to unblock the port in the CentOS VM firewall settings. If we don’t do this, nothing will work from outside. Here is how I did it:
- In the VM GUI, go to Firewall Configuration.
- Select Zones > public > Ports.
- Click Add to open a specific port and enter 8086 (this is tcp).
The port should now be open and reachable. To check this, go to the Terminal and check which ports are open in the VM firewall with this command:
firewall-cmd --list-ports
This should show 8086/tcp, and the influx database should now be reachable from outside via IP 127.0.0.1 and port 8086.
For more info go here.
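The check with firewall-cmd --list-ports can also be automated inside the VM. The sketch below assumes the command prints space-separated entries of the form port/protocol, such as 8086/tcp; the function names are made up for this example, and depending on your setup the command may need root privileges.

```python
import subprocess


def parse_open_ports(output):
    """Turn output like '8086/tcp 22/tcp' into a set of (port, protocol) pairs."""
    ports = set()
    for entry in output.split():
        port, _, protocol = entry.partition("/")
        ports.add((port, protocol))
    return ports


def list_open_ports():
    """Run firewall-cmd --list-ports and parse what it prints."""
    result = subprocess.run(
        ["firewall-cmd", "--list-ports"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_open_ports(result.stdout)


# Example check inside the VM:
# print(("8086", "tcp") in list_open_ports())
```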
Step 3: Python
Installing conda
In the VM, go to the Terminal and download Miniconda:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
Install Miniconda and accept adding to PATH when prompted:
bash Miniconda3*.sh
Close the Terminal and re-open it. Conda should now be accessible; test by checking the conda version:
conda --version
Set up environment
Get a list of all conda environments:
conda info --envs
If you have just installed conda, you only have the default environment base. Instead of installing new packages for base, we will use environments.
Create a new environment called indev:
conda create --name indev
Activate the environment indev:
source activate indev
List the environments again with conda info --envs. The output now also includes the new environment indev, with a star next to it to show that it is the currently active environment.
Now install the full Anaconda distribution in the environment indev (this will take some minutes):
conda install anaconda
Now check which packages are installed (a lot!):
conda list
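From inside Python you can double-check which environment a script is actually running in. A small sketch, assuming that activating an environment sets the CONDA_DEFAULT_ENV variable (which conda normally does); the function names are hypothetical.

```python
import os
import sys


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if unset."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV")


def describe_interpreter(environ=None):
    """Return a one-line summary: environment name and interpreter path."""
    env = active_conda_env(environ)
    label = env if env else "no conda environment"
    return f"{label}: {sys.executable}"


print(describe_interpreter())
```

With indev activated, the printed line should start with indev and point to a Python interpreter inside that environment.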
Execute Python script
Let’s say the Python script is on your Windows computer in a specific folder, e.g. E:\stuff\work\programming\test\script.py
To execute this script, type:
python /mnt/e/stuff/work/programming/test/script.py
Installing influxdb package for Python
In the Terminal with conda installed, type:
pip install influxdb
Accessing the database from host
Here we access the database (running on the CentOS VM) from outside with Python (running on a Windows 10 PC). This worked for me. Remember to open port 8086 in the CentOS VM firewall or nothing will work!
This short script creates a new database with the name so_many_datapoints:
import influxdb
client = influxdb.InfluxDBClient(host='127.0.0.1', port=8086, username='your_username', password='your_insanely_secret_password')
client.create_database('so_many_datapoints')
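Building on the script above, a next step could be to write a data point and read it back. The following is a sketch under assumptions: it expects a client object like the InfluxDBClient created above and uses its switch_database, write_points and query methods together with get_points on the result, as I understand the influxdb package's API; the helper names make_point and write_and_read are made up for this example.

```python
def make_point(measurement, tags, fields):
    """Build one point in the dictionary layout that write_points accepts."""
    return {"measurement": measurement, "tags": dict(tags), "fields": dict(fields)}


def write_and_read(client, database, point):
    """Write a single point to the database, then return all rows of its measurement."""
    client.switch_database(database)
    client.write_points([point])
    result = client.query('SELECT * FROM "%s"' % point["measurement"])
    return list(result.get_points())


# Example use with the client from the script above:
# point = make_point(
#     "air_temperature",
#     {"host": "instrument_a", "region": "europe"},
#     {"value": 24.5},
# )
# print(write_and_read(client, "so_many_datapoints", point))
```

No timestamp is set on the point, so the server assigns the current time when the point is written.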
The influxdb Python library
Getting Started / API / influxdb on GitHub / API: DataFrameClient object in influxdb
Last Updated on 19 Nov 2020 16:19