gehttpd¶
Introduction¶
gehttpd is a lightweight multi-threaded web server written in C++/Qt and directly integrated with the GAUSS Engine. It allows users to implement their algorithms in the GAUSS matrix programming language, and run those procedures with arguments supplied by requests to the web server.
Although intended to be used behind another web server such as Apache or nginx as a Reverse Proxy, it can be used standalone as well. Other notable features are:
- Session handling (cookies for maintaining state per user)
- SSL support
- Logging
- Templating (for internal usage)
Below are two popular use-cases where one might use gehttpd to run GAUSS code:
Web browser workflow¶
A common scenario with a web browser client is a page whose scripts take real-time input from the user, run algorithms on the server, and return the result.
Business Intelligence workflow¶
Note that BI software can interface with gehttpd directly in a scenario such as this, since the application server would be running on an internal network, but placing Apache or nginx in front allows for greater flexibility and security.
Installation¶
All file paths specified will assume the root is the directory gehttpd was installed to.
- Extract the gehttpd archive file (gehttpd.tar.gz or gehttpd.zip) to a directory of your choice.
- Rename the file etc/gehttpd.ini.sample to etc/gehttpd.ini for configuration in the next section.
Minimum Requirements¶
Linux¶
Make sure all the necessary dependencies are installed. To check which dependencies are missing on a Linux machine, you can run:
ldd gehttpd | grep not
Keep in mind the run.sh script will set appropriate environment variables for locating dependencies bundled with gehttpd. ldd should be used for locating missing system dependencies.
- Tested on Debian 9 and 10, as well as CentOS 6 and 7.
- Ubuntu and openSUSE should also work, though not all versions have been tested.
For Debian-based systems (e.g. Ubuntu):
sudo apt install libglu1-mesa libxcomposite1 libxrender1 libfontconfig1
xvfb may be required on headless systems:
sudo apt install libxi6 xvfb
In lieu of that, enable user namespace cloning in the kernel with the following command:
sudo sysctl -w kernel.unprivileged_userns_clone=1
Windows¶
- Tested on Windows 8 and 10.
- Usage with Windows 7 or prior is not guaranteed.
Quick Start¶
First copy the template configuration file etc/gehttpd.ini.sample to etc/gehttpd.ini.
After the previous Installation instructions are complete, only one key modification to the etc/gehttpd.ini file is required to run the provided example files.
Set the value of the [gehttpd] > home key to the installation directory of the GAUSS Engine:
[gehttpd]
home=/home/research/gauss20
Any custom routing files must be specified in the [gehttpd] > filenames key. To use the example routes, the key can be left with its default value.
Once this is complete, continue with the section Running the Server.
Configuration¶
Ensure that you have copied the template configuration file etc/gehttpd.ini.sample to etc/gehttpd.ini.
The sections below appear in the same order as in the etc/gehttpd.ini configuration file, and each is shown with its keys' default values.
[listener]¶
;host=127.0.0.1
port=5050
minThreads=4
maxThreads=100
cleanupInterval=60000
readTimeout=60000
;sslKeyFile=ssl/key.pem
;sslCertFile=ssl/cert.pem
maxRequestSize=16000
maxMultiPartSize=10000000
asynchronous=1
host: | The network interface to listen on when starting the web server. The default behavior is to bind to all available network interfaces. Uncomment the line and use 127.0.0.1 to prevent responding to requests from any outside sources. |
---|---|
port: | The default port to bind to. If running behind Apache or nginx as a Reverse Proxy, do not set this to 80 or 443. |
minThreads: | The minimum number of threads to be reserved by the connection pool when waiting for requests. Ignored if asynchronous is set to 0. |
maxThreads: | The maximum number of threads allowed for the connection pool to spawn when responding to requests. Ignored if asynchronous is set to 0. |
cleanupInterval: | How often the connection pool is cleaned up. Removes idle threads until minThreads is reached. Ignored if asynchronous is set to 0. |
readTimeout: | How long a connection may remain idle before it times out automatically. |
sslKeyFile: | Path to the SSL private key. Expects RSA key in PEM format. |
sslCertFile: | Path to the local SSL certificate. Expects PEM format. |
maxRequestSize: | The maximum size of an HTTP request in bytes. |
maxMultiPartSize: | The maximum size of the body of a multipart/form-data HTTP request in bytes. |
asynchronous: | Whether a thread pool is used for handling incoming requests. If this is set to 0, all incoming requests are handled sequentially and run in the main server thread. This should be used for special use-cases where the app server needs all requests to run through the main thread, such as running procedures that generate graphs. |
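As a rough illustration of how these keys can be inspected from a client or deployment script, the sketch below reads a [listener] section with Python's standard configparser module. The SAMPLE_INI text is an excerpt of the defaults listed above, and the listener_summary helper is a hypothetical name introduced here; it is not part of gehttpd.

```python
import configparser

# Excerpt of the [listener] defaults shown above (assumed layout of etc/gehttpd.ini).
SAMPLE_INI = """
[listener]
;host=127.0.0.1
port=5050
minThreads=4
maxThreads=100
asynchronous=1
"""


def listener_summary(ini_text):
    """Return (host, port, asynchronous) from the [listener] section."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser["listener"]
    # Lines starting with ';' are comments, so a commented-out host is simply absent.
    host = section.get("host", fallback=None)
    port = section.getint("port")
    asynchronous = section.getint("asynchronous", fallback=1) != 0
    return host, port, asynchronous


host, port, asynchronous = listener_summary(SAMPLE_INI)
print(host or "all interfaces", port, asynchronous)
```

A missing host is reported as "all interfaces" here because the description above states that this is the default binding behavior.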
[gehttpd]¶
home=/home/research/mteng20
filenames=test/sample.e|test/echo.e
persistentWorkspace=0
home: | The full path to the installation directory of the GAUSS Engine. This is required and gehttpd will fail to run if not set properly. |
---|---|
filenames: | A list of filenames separated by the pipe character, as in the default value above. At least one filename must be provided. |
persistentWorkspace: | Boolean flag (0 or 1) that determines whether state is maintained for clients. When enabled, a client's GAUSS workspace is saved between requests, so the state of the workspace after the last request is available to new requests. Depending on the type of work performed, a saved workspace can potentially be large, especially with numerous clients. If not enabled, a fresh workspace and/or workspace with specified initial state will be used when handling new requests. Use with caution. |
[templates]¶
path=templates
suffix=.tpl
encoding=UTF-8
cacheSize=1000000
cacheTime=60000
path: | The path where the template files can be found. This can be either an absolute path or relative path from the gehttpd/etc directory. |
---|---|
suffix: | The default suffix of the template files. |
encoding: | The encoding that is sent to the web browser in case of text files. |
cacheSize: | The size of the server cache in bytes. |
cacheTime: | The duration in milliseconds of cache items before they expire. Setting this to 0 will disable expiration. |
[docroot]¶
path=docroot
encoding=UTF-8
maxAge=60000
cacheTime=60000
cacheSize=1000000
maxCachedFileSize=65536
path: | The path where the HTML files can be found. This can be either an absolute path or relative path from the gehttpd/etc directory. |
---|---|
encoding: | The encoding that is sent to the web browser in case of text files. |
maxAge: | The duration in milliseconds the file should reside in the browser's cache. Setting this to 0 will disable expiration. |
cacheTime: | The duration in milliseconds of cache items before they expire. Setting this to 0 will disable expiration. |
cacheSize: | The size of the server cache in bytes. |
maxCachedFileSize: | The maximum size in bytes of a specific file in the server cache. |
[sessions]¶
expirationTime=3600000
cookieName=gehttpd
cookiePath=/
cookieComment=Stores persistent GAUSS workspace
;cookieDomain=
expirationTime: | The duration in milliseconds before the cookie will expire. |
---|---|
cookieName: | The name of the cookie to store in the browser. |
cookiePath: | The URL path where the session is valid. This is useful when you have data not related to the session in /static/ and session related data in /content/ or similar. |
cookieComment: | A comment to describe the cookie. This may be displayed by the client’s web browser somewhere. |
cookieDomain: | The domain the cookie will be defined to. Defaults to the current domain (determined by the client browser). |
[logging]¶
fileName=../logs/gehttpd.log
minLevel=1
bufferSize=100
maxSize=1000000
maxBackups=2
timestampFormat=dd.MM.yyyy hh:mm:ss.zzz
msgFormat={timestamp} {typeNr} {type} {thread} {msg}
; QT5 supports: msgFormat={timestamp} {typeNr} {type} {thread} {msg}\n in {file} line {line} function {function}
fileName: | The file path where the log content will be written. This can be an absolute path or a path relative from the |
---|---|
minLevel: | Minimum level of message types that are written out directly or trigger writing the buffered content. |
bufferSize: | Defines the size of the ring buffer in bytes. Setting this to 0 means unlimited. |
maxSize: | The maximum size of the log file in bytes. The file will be backed up and replaced by a new file if it becomes larger than this limit. Please note that the actual file size may become a little bit larger than this limit. Setting this to 0 means unlimited. |
maxBackups: | The number of backup files to keep. Setting this to 0 means unlimited. |
timestampFormat: | The format of timestamps. These expressions may be used for the date: These expressions may be used for the time: All other input characters will be ignored. Any sequence of characters that are enclosed in single quotes will be treated as text and not be used as an expression. Two consecutive single quotes ('') are replaced by a single quote in the output. Formats without separators (e.g. “HHmm”) are currently not supported. Example format strings (assumed that the date/time is 21 May 2001 14:13:09.120): If the datetime is invalid, an empty string will be returned. |
msgFormat: | The format of log message entries. The following variables may be used in the message and in msgFormat: |
Define URL Endpoints (Routes)¶
Endpoints are the core of request-handling in gehttpd, and are simply definitions for how to direct requests to the appropriate GAUSS procedures to handle them and return a response.
A sample web request might have the following steps:
- The client sends a request to gehttpd.
- gehttpd looks at the URL in the request and determines which GAUSS procedure to run. The arguments from the request are passed to the GAUSS procedure and the function is evaluated.
- The result is encoded in the requested format (json, xml, or raw [plain]) and sent back to the client.
We define these endpoints directly in GAUSS code, and the files that make up these definitions are specified in the filenames key of the gehttpd.ini configuration file.
Mapping a URL route to a GAUSS proc consists of 3 steps:
- Include the gehttpdroutes.src file. This is only necessary once per source file.
- Define the GAUSS procedure (proc).
- Use the route() function to tell gehttpd how to handle the proc and associated URL path.
The route() function takes the following form:
Format¶
route(url, procname, args);
Parameters¶
url: | string, the relative path used to access the GAUSS proc from a client. Always starts with a forward-slash. |
---|---|
procname: | string, the name of the GAUSS proc to execute when the specified URL is accessed. |
args: | string, ordered comma-delimited list of arguments that the proc accepts. These argument names must match the web request parameter names exactly. This argument determines the correct ordering when calling the GAUSS proc. |
Note
Prefixing an argument with a specific character can change the way gehttpd treats the argument when processing the request. The below table shows the available prefixes and the corresponding type coercion that occurs. This allows routes to ensure that arguments come in as the anticipated type. The exception to this is the
As a quick example, specifying an argument as
The following 2 styles are understood by GAUSS when defining routes.
Style 1: GET/POST arguments¶
This style is the traditional method of passing arguments, whether in the URL through ?key=value&key1=value1 pairs in a GET request, or in the body of a POST request.
Addition example¶
// Step 1: Include the gehttpdroutes.src file
#include gehttpdroutes.src
// Step 2: Define the GAUSS proc
proc (1) = my_add_proc(a, b);
retp(a+b);
endp;
// Step 3: Instruct gehttpd how to handle this proc.
route("/add", "my_add_proc", "a,b");
Forced string example¶
// Step 1: Include the gehttpdroutes.src file
#include gehttpdroutes.src
// Step 2: Define the GAUSS proc
proc (1) = hello(name);
retp("Hello " $+ name);
endp;
// Step 3: Instruct gehttpd how to handle this proc.
route("/hello", "hello", "$name");
File upload example¶
Note: File uploading is only supported for multipart fields in a POST request. The uploaded file is saved to a temporary file and the filename is passed to the proc.
#include gehttpdroutes.src
proc (0) = my_dstatmt_proc(file, formula);
call dstatmt(file, formula);
endp;
route("/dstatmt", "my_dstatmt_proc", "@file, $formula");
Style 2: RESTful arguments¶
This style passes arguments in the URL itself, as shown. Because they are passed in this manner, only numeric scalars and strings can be passed with this style.
Argument types¶
By default, GAUSS treats an argument supplied in a REST request as a numeric scalar. Remember that the route() proc allows marking arguments as a string by prefixing them with the $ character, as follows:
Forced string example¶
// Step 1: Include the gehttpdroutes.src file
#include gehttpdroutes.src
// Step 2: Define the GAUSS proc
proc (1) = hello(name);
retp("Hello " $+ name);
endp;
// Step 3: Instruct gehttpd how to handle this proc.
route("/hi/<name>", "hello", "$name");
Addition example¶
Note the multiple ways of calling the same GAUSS proc.
// Step 1: Include the gehttpdroutes.src file
#include gehttpdroutes.src
// Step 2: Define the GAUSS proc
proc (1) = my_add_proc(a, b);
retp(a+b);
endp;
// Step 3: Instruct gehttpd how to handle this proc.
route("/add/<a>/<b>", "my_add_proc", "a,b");
// Alternative
route("/add/<a>/plus/<b>", "my_add_proc", "a,b");
// Alternative
route("/<a>/plus/<b>", "my_add_proc", "a,b");
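To make the relationship between a RESTful route pattern and the ordered argument list more concrete, the following Python sketch models one way such matching could work. It is an illustrative model only and not gehttpd's actual implementation: the compile_route and match_route helpers are hypothetical, and only the default numeric treatment and the $ string prefix described above are covered.

```python
import re


def compile_route(pattern):
    """Turn '/add/<a>/plus/<b>' into a regex with one named group per placeholder."""
    # re.split with a capture group alternates literal text and placeholder names.
    parts = re.split(r"<(\w+)>", pattern)
    regex = ""
    for index, part in enumerate(parts):
        if index % 2:
            regex += f"(?P<{part}>[^/]+)"   # placeholder: one URL path segment
        else:
            regex += re.escape(part)        # literal text between placeholders
    return re.compile("^" + regex + "$")


def match_route(pattern, args, path):
    """Return the ordered argument values for path, or None if it does not match."""
    match = compile_route(pattern).match(path)
    if match is None:
        return None
    values = []
    for spec in (name.strip() for name in args.split(",")):
        if spec.startswith("$"):
            values.append(match.group(spec[1:]))      # '$' prefix: keep as string
        else:
            values.append(float(match.group(spec)))   # default: numeric scalar
    return values


print(match_route("/add/<a>/plus/<b>", "a,b", "/add/5/plus/10"))
print(match_route("/hi/<name>", "$name", "/hi/Bob"))
```

The values come back in the order given by the args string, mirroring how that argument determines the ordering used when calling the proc.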
Running the Server¶
Once the etc/gehttpd.ini file has been configured, gehttpd can be started with the provided run.sh (Linux) or run.bat file.
$ ./run.sh
C:\gehttpd> run.bat
Executing this in the terminal will set the appropriate environment variables and start the server. On Linux at least, it is recommended to run this in a screen or tmux session, because a process launched only as a background task with & will still end upon logging out of the ssh session that started it.
If executing gehttpd with xvfb, the following command uses a 1920x1080 buffer by default:
$ xvfb-run ./run.sh
Making a Request¶
The following are valid methods of constructing a request that gehttpd will understand.
GET / RESTful requests¶
GET is the most basic method: arguments are passed in the URL as key=value pairs, starting after a ? character, with additional pairs separated by the & character. Quote the URL so the shell does not interpret the ? and & characters.
$ curl -X GET "http://localhost:5050/hello?name=Bob"
$ curl -X GET "http://localhost:5050/add?a=5&b=10"
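The same query strings can be assembled programmatically. The sketch below uses Python's standard urllib.parse module; BASE_URL assumes the default port from the configuration section, and get_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:5050"  # assumption: default port from etc/gehttpd.ini


def get_url(route, **params):
    """Build a GET URL with key=value pairs appended after '?'."""
    query = urlencode(params)  # also percent-encodes characters that need it
    return f"{BASE_URL}{route}?{query}" if query else f"{BASE_URL}{route}"


print(get_url("/hello", name="Bob"))
print(get_url("/add", a=5, b=10))
```

With a server running, the resulting URL could then be fetched with urllib.request.urlopen.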
RESTful requests are technically just requests using the GET method, but they embed the argument values at the locations in the URL indicated by the predefined route pattern.
Example from the previous add route:
$ curl -X GET http://localhost:5050/add/5/10
$ curl -X GET http://localhost:5050/add/5/plus/10
POST request¶
POST requests embed the key=value pairs in the body of the request. They are not visible directly in the URL.
$ curl -X POST -d "name=Bob" http://localhost:5050/hello
$ curl -X POST -d "a=5&b=10" http://localhost:5050/add
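As a sketch of the same form-encoded POST from Python, the block below prepares a request object with the standard library without sending it. The form_post helper is a hypothetical name, and the URL assumes the default port.

```python
from urllib.parse import urlencode
from urllib.request import Request


def form_post(url, **fields):
    """Prepare a POST request carrying key=value pairs in the body."""
    body = urlencode(fields).encode("ascii")
    return Request(url, data=body, method="POST")


req = form_post("http://localhost:5050/add", a=5, b=10)
print(req.get_method(), req.full_url, req.data)
```

Passing the prepared request to urllib.request.urlopen would send it; urllib adds the form-encoded Content-Type header at that point when none was set explicitly.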
POST request with JSON arguments¶
POST requests that carry JSON, either as key=JSON pairs or as a full JSON body, allow specifying more complex symbol types, such as matrices and string arrays. Each argument, together with its dimensions and type, is described with the following JSON structure:
{
"type": "scalar/matrix/string/string array",
"rows": n,
"cols": m,
"data": ...
}
type: | string representing the GAUSS symbol type. Valid options are: scalar, matrix, string, string array. |
---|---|
rows: | numeric scalar representing number of rows. Must be conformable to size of |
cols: | numeric scalar representing number of columns. Must be conformable to size of |
data: | appropriate JSON data type based on symbol type described in |
A few examples are below:
Numeric vector argument¶
{
"type": "matrix",
"data": [1, 2, 3, 4]
}
We can include the rows and cols keys to specify the dimensions:
Matrix with rows and columns¶
{
"type": "matrix",
"data": [1, 2, 3, 4],
"rows": 2,
"cols": 2
}
String array¶
{
"type": "string array",
"data": ["M", "F", "M", "M"]
}
String¶
{
"type": "string",
"data": "M"
}
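Building these structures by hand is error-prone, so a small helper can be convenient. The Python sketch below wraps a value in the type/rows/cols/data structure and encodes arguments as key=JSON pairs. The gauss_symbol and form_body names are hypothetical, and the mapping from Python types to GAUSS symbol types is an assumption based on the examples above.

```python
import json
from urllib.parse import urlencode


def gauss_symbol(data, rows=None, cols=None):
    """Wrap a Python value in the {"type", "data", "rows", "cols"} structure."""
    if isinstance(data, str):
        return {"type": "string", "data": data}    # assumed form of a string symbol
    if isinstance(data, (int, float)):
        return {"type": "scalar", "data": data}
    items = list(data)
    all_strings = bool(items) and all(isinstance(x, str) for x in items)
    symbol = {"type": "string array" if all_strings else "matrix", "data": items}
    if rows is not None and cols is not None:
        # The dimensions must account for every element of data.
        if rows * cols != len(items):
            raise ValueError("rows * cols must equal the number of data elements")
        symbol["rows"] = rows
        symbol["cols"] = cols
    return symbol


def form_body(**symbols):
    """Encode each argument as key=JSON for the body of a POST request."""
    return urlencode({name: json.dumps(sym) for name, sym in symbols.items()})


matrix = gauss_symbol([1, 2, 3, 4], rows=2, cols=2)
print(json.dumps(matrix))
print(form_body(a=matrix, b=gauss_symbol(10)))
```

Checking the dimensions on the client side catches a mismatch before the request is sent.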
POST request with JSON body¶
A full JSON body is quite possibly the most flexible way of dealing with various symbols. Because many languages can convert a dictionary to a JSON object, it also makes passing data to gehttpd easy.
Note that because we’re using the body of the request, the JSON document keys represent the input argument names:
{
"name": {
"type": "scalar/matrix/string/string array",
"rows": n,
"cols": m,
"data": ...
},
...
}
name: | The name of the input argument to the proc to be called. This must match one of the input names specified in the |
---|---|
type: | string representing the GAUSS symbol type. Valid options are: scalar, matrix, string, string array. Optional. |
rows: | numeric scalar representing number of rows. Must be conformable to size of |
cols: | numeric scalar representing number of columns. Must be conformable to size of |
data: | appropriate JSON data type based on symbol type described in |
$ curl -X POST -H "Content-Type: application/json" -d "{\"name\": \"Bob\"}" http://localhost:5050/hello
$ curl -X POST -H "Content-Type: application/json" -d "{\"a\": 5, \"b\": 10}" http://localhost:5050/add
$ curl -X POST -H "Content-Type: application/json" -d "{\"a\": {\"type\": \"matrix\", \"data\": [1, 2, 3, 4], \"rows\": 2, \"cols\": 2}, \"b\": 10}" http://localhost:5050/add
gehttpd can also deduce a pure numeric vector and automatically treat it as a matrix:
$ curl -X POST -H "Content-Type: application/json" -d "{\"a\": [1, 2, 3, 4], \"b\": 10}" http://localhost:5050/add
or as a string array vector:
$ curl -X POST -H "Content-Type: application/json" -d "{\"name\": [\"Bob\", \"Alice\", \"Mike\"]}" http://localhost:5050/hello
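Since a dictionary converts directly to a JSON object, the full-body style is straightforward from most languages. The Python sketch below prepares such a request with the standard library without sending it; json_post is a hypothetical helper name, and the URL assumes the default port.

```python
import json
from urllib.request import Request


def json_post(url, **arguments):
    """Prepare a POST request whose body is a JSON document keyed by argument name."""
    body = json.dumps(arguments).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


req = json_post(
    "http://localhost:5050/add",
    a={"type": "matrix", "data": [1, 2, 3, 4], "rows": 2, "cols": 2},
    b=10,
)
print(req.data.decode("utf-8"))
```

Unlike the cURL commands, no manual escaping of quotes is needed because json.dumps produces the body.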
Persistent requests with cURL (Sessions)¶
Testing requests that make use of persistence, or sessions, like the example illustrated in sample.e, requires additional arguments to cURL.
Note that persistence is disabled by default, but can be enabled in the gehttpd.ini file by modifying the following value:
persistentWorkspace=1
Per the cURL documentation, we can include both the -b and -c flags. These instruct cURL to use the specified file for both reading cookie information prior to the request, and writing received cookies after a request is completed. Using these flags, we can mimic the behavior of a browser, which does this automatically for us.
# Increase x by 1 and show current value
curl -b gehttpdcookie -c gehttpdcookie -X GET http://localhost:5050/inc
# Increase x by 1 and show current value
curl -b gehttpdcookie -c gehttpdcookie -X GET http://localhost:5050/inc
# Show final value of x
curl -b gehttpdcookie -c gehttpdcookie -X GET http://localhost:5050/whatisx
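A scripted client can keep a session in a similar way. The following Python sketch pairs a cookie file with a URL opener using the standard http.cookiejar module, roughly mirroring the -b and -c flags. The session_opener helper is a hypothetical name, and the /inc route in the commented usage is the one from the cURL example above.

```python
import os
from http.cookiejar import MozillaCookieJar
from urllib.request import HTTPCookieProcessor, build_opener


def session_opener(cookie_file):
    """Return (opener, jar): the opener sends and stores cookies kept in cookie_file."""
    jar = MozillaCookieJar(cookie_file)
    if os.path.exists(cookie_file):
        # Reuse cookies from a previous run, including session cookies.
        jar.load(ignore_discard=True)
    opener = build_opener(HTTPCookieProcessor(jar))
    return opener, jar


# Usage against a running server (not executed here):
#   opener, jar = session_opener("gehttpdcookie")
#   print(opener.open("http://localhost:5050/inc").read().decode("utf-8"))
#   jar.save(ignore_discard=True)   # write received cookies back to the file
```

Saving with ignore_discard=True matters because session cookies without an expiry would otherwise be dropped when the jar is written.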
The Response¶
A successful response will by default return a JSON object with the following structure:
{
"success": true,
"output": "",
"results": [
{
"type": "matrix",
"rows": 5,
"cols": 1,
"data": [1.0,2.0,3.0,4.0,5.0],
"complex": [6.0,7.0,8.0,9.0,10.0]
}
]
}
success: | true if the proc returned without error and false if the proc was unable to successfully complete. |
---|---|
output: | any output emitted by the program while running. This would include the contents of any print() (implicit or explicit) statement(s) in the program. |
results: | an array of return values represented as JSON objects from the proc. Will always return an array. The JSON objects will follow the same structure as JSON objects accepted in the section POST request with JSON arguments. |
Note
Complex values are currently only supported in the response.
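A client will usually want the results as native values. The Python sketch below parses a response like the one shown and reshapes each matrix result into a list of rows. Two assumptions are made that the text above does not confirm: data is laid out row by row, and the complex array holds the imaginary parts matching data element by element. The parse_response and result_to_rows names are hypothetical.

```python
import json

# Sample response copied from the structure shown above.
SAMPLE_RESPONSE = """
{
  "success": true,
  "output": "",
  "results": [
    {"type": "matrix", "rows": 5, "cols": 1,
     "data": [1.0, 2.0, 3.0, 4.0, 5.0],
     "complex": [6.0, 7.0, 8.0, 9.0, 10.0]}
  ]
}
"""


def result_to_rows(result):
    """Reshape one matrix result into a list of rows (assumes row-major data)."""
    data = result["data"]
    imaginary = result.get("complex")
    if imaginary is not None:
        # Assumption: "complex" carries the imaginary part of each element.
        data = [complex(real, imag) for real, imag in zip(data, imaginary)]
    rows, cols = result["rows"], result["cols"]
    return [data[r * cols:(r + 1) * cols] for r in range(rows)]


def parse_response(text):
    """Return the reshaped results, or raise if the proc did not succeed."""
    reply = json.loads(text)
    if not reply["success"]:
        raise RuntimeError(reply["output"])
    return [result_to_rows(result) for result in reply["results"]]


matrices = parse_response(SAMPLE_RESPONSE)
print(matrices[0])
```

Results of other types, such as strings, may not carry rows and cols and would need separate handling.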
Response Formats¶
While the Request mandates the inputs be JSON or raw for scalar types, the response allows multiple formats. By default, JSON will be the selected response format.
To change the Response format, supply the argument fmt to the Request and specify one of the following as the value:
- json
- xml
- raw
Note
If raw is specified, no markup will be performed and the output, along with the proc results, will be returned as plain text.
Specify format example¶
To specify the response type in a GET request:
$ curl -X GET "http://localhost:5050/hello?name=Bob&fmt=xml"
To specify the response type in a POST request:
$ curl -X POST -H "Content-Type: application/json" -d "{\"name\": \"Bob\", \"fmt\": \"raw\"}" http://localhost:5050/hello
XML Response structure¶
An XML response will mimic that of a JSON response:
<?xml version="1.0"?>
<return>
<success>1</success>
<output>...</output>
<results>
<result type="matrix" rows="5" cols="1">
<data>
<values>
<value>1.0</value>
<value>2.0</value>
<value>3.0</value>
<value>4.0</value>
<value>5.0</value>
</values>
<complex-values>
<value>1.0</value>
...
<value>5.0</value>
</complex-values>
</data>
</result>
<result ...>
...
</result>
</results>
</return>
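An XML response with this layout can be read with Python's standard xml.etree.ElementTree module, as in the sketch below. SAMPLE_XML is a shortened, hypothetical instance of the structure above, and parse_xml_response is a helper name introduced here. Only the real-valued values are read; complex-values are ignored for brevity.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical instance of the XML structure shown above.
SAMPLE_XML = """<return>
  <success>1</success>
  <output></output>
  <results>
    <result type="matrix" rows="2" cols="1">
      <data>
        <values>
          <value>1.0</value>
          <value>2.0</value>
        </values>
      </data>
    </result>
  </results>
</return>"""


def parse_xml_response(text):
    """Return (success, results) where each result is a dict of type, rows, cols, data."""
    root = ET.fromstring(text)
    success = root.findtext("success") == "1"
    results = []
    for result in root.iter("result"):
        values = [float(v.text) for v in result.findall("data/values/value")]
        results.append({
            "type": result.get("type"),
            "rows": int(result.get("rows")),
            "cols": int(result.get("cols")),
            "data": values,
        })
    return success, results


ok, parsed = parse_xml_response(SAMPLE_XML)
print(ok, parsed)
```

The returned dictionaries deliberately use the same keys as the JSON response so that downstream code can treat both formats alike.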
Troubleshooting¶
Please contact support@aptech.com or submit a support ticket at https://www.aptech.com/support/submit-support-ticket/ for additional help.