Thursday, January 19, 2012

PostgreSQL and HAProxy queuing

Imagine that you have a PostgreSQL server that can handle 10 concurrent SQL queries from psql clients (over TCP) without service degrading; that is, PostgreSQL does not slow down, and each query is fulfilled within an expected and acceptable time.

Now imagine that as the number of concurrent SQL queries rises above 10, performance degrades and each query takes longer to fulfil. Eventually the queries take so long that good service delivery is no longer possible. Often, if we simply queue the SQL queries (psql sessions) and allow only, say, 10 to be active at any given time, overall performance will be better.

Note the assumption that each new TCP _session_ equates to one discrete SQL query. In my situation it does, so queuing TCP sessions effectively queues SQL queries.

One way to manage queuing is with HAProxy. I use HAProxy for its more common purpose, load balancing HTTP traffic across many web servers, but I also use it for plain old TCP load management, simply because it's there and available.

HAProxy can run in HTTP mode or in plain TCP mode. In TCP mode, HAProxy can manage the psql sessions without any protocol 'intelligence', which is exactly what I need. By the way, I do not use HAProxy to load balance psql sessions across multiple PostgreSQL back ends. It is used simply as a queuing mechanism for service load management.

An example configuration:

frontend pg_frontend
        bind 192.168.0.140:5432
        mode tcp
        option tcplog
        default_backend pg_backend
        maxconn 20


backend pg_backend
        # The balance algo won't matter
        balance roundrobin
        mode tcp
        option tcplog
        fullconn 15
        timeout queue 120000
        server pg1 192.168.0.142:5432 minconn 10 maxconn 20 check



The front end section 'pg_frontend' binds to IP 192.168.0.140 on port 5432, the usual PostgreSQL port. We set the front end to run in TCP mode, turn on TCP logging, and tell it to use pg_backend as the default back end. I have maxconn for the front end set to 20; beyond that, HAProxy stops accepting new connections and the operating system's TCP backlog will queue them up or drop them.



The pg_backend section is where HAProxy performs the queuing. I set a balance algorithm explicitly even though it has no effect with a single server like this (as far as I can tell, HAProxy falls back to roundrobin when none is given). The mode is again TCP, and I have TCP logging turned on.

The key settings for queuing are fullconn, minconn and maxconn. Under light load, the back end server is given up to minconn (10) concurrent sessions, and sessions beyond the current limit wait in the HAProxy queue. As I read the HAProxy documentation, that limit is dynamic: it ramps up from minconn towards maxconn in proportion to the total number of sessions on the back end (active plus queued), and reaches maxconn (20) once that total hits fullconn (15). Once 20 concurrent sessions are active, any sessions in excess are left to the operating system's TCP backlog to manage or drop, because the front end accepts no more than 20.

The remaining directive is timeout queue 120000. It is a safety net, allowing sessions to remain in the HAProxy back end queue for up to 2 minutes (the value is in milliseconds) before expiring and being dropped.

The effect of fullconn, minconn and maxconn is to let concurrent sessions burst, handling brief periods of higher load, while holding the server close to minconn when the back end is quiet.

Note that such low values for fullconn, maxconn and minconn are a little contrived, to simplify my explanation and example. I use larger values than this in reality.
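
To make the ramp concrete, here is a small Python sketch of how I understand HAProxy derives the per-server limit from minconn, maxconn and fullconn. The function name dynamic_server_limit is made up for illustration, the formula is my reading of the HAProxy documentation rather than a copy of its code, and it ignores slowstart.

```python
def dynamic_server_limit(beconn, minconn, maxconn, fullconn):
    """Approximate per-server connection limit for a given back end load.

    beconn is the total number of sessions on the back end (active + queued).
    This is an illustrative model, not HAProxy's actual implementation.
    """
    # At or above fullconn the server is allowed its full maxconn.
    if beconn >= fullconn or minconn == maxconn:
        return maxconn
    # Below fullconn the limit scales with load but never drops under minconn.
    return max(minconn, beconn * maxconn // fullconn)


# Values from the example configuration: minconn 10, maxconn 20, fullconn 15.
for beconn in (5, 8, 9, 12, 15, 20):
    limit = dynamic_server_limit(beconn, minconn=10, maxconn=20, fullconn=15)
    print("backend sessions=%2d -> server limit=%2d" % (beconn, limit))
```

With these contrived numbers the limit stays at 10 until about 9 sessions are on the back end, then climbs to 20 by the time 15 are present. Larger gaps between minconn, fullconn and maxconn give a gentler ramp.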

Thursday, January 21, 2010

How to get USB monitoring working on Fedora 11 with wireshark

libpcap on F11 missed out on the USB monitoring capability that later versions (1.0 and newer) have. Fedora 12 has libpcap 1.0.4, so it should work just fine.

I resolved the problem on Fedora 11 by pulling down the rawhide version of libpcap, building it on my F11 system, installing the new libpcap and libpcap-devel packages, and then rebuilding wireshark so that it used the newer libpcap.

The basic steps:

su -c 'yum install yum-utils'
su -c 'yum install rpmdevtools'

As a normal user, run rpmdev-setuptree from the rpmdevtools package. It creates the ~/rpmbuild directory ready to build packages as a NON-ROOT user. Very important :)


Now, as the normal user, get the libpcap src.rpm (I usually cd to ~/rpmbuild and keep these files there):

cd ~/rpmbuild
yumdownloader libpcap --enablerepo=rawhide --source

Get the wireshark src.rpm too:
yumdownloader wireshark --source

Now get the requisite development packages to build:

su -c 'yum-builddep wireshark-1.2.2-1.fc11.src.rpm'
su -c 'yum-builddep libpcap-1.0.0-5.20091201git117cb5.fc13.src.rpm'


Note that the package file names may be different. At the time I did this the files were named as above.

Install the src.rpm packages. NOTE: do this as a normal user. The rpmdev-setuptree command sets up a macro file so that when you install src.rpm files they are unpacked into the ~/rpmbuild directory.

rpm -ivh libpcap-1.0.0-5.20091201git117cb5.fc13.src.rpm
rpm -ivh wireshark-1.2.2-1.fc11.src.rpm

Now we are ready to build. First libpcap. In the ~/rpmbuild directory:

rpmbuild -ba SPECS/libpcap.spec

Wait for the build to complete. Once finished, we need to install the new packages. The packages are placed in ~/rpmbuild/RPMS/ in the directory that matches the architecture you built on.

Now install the libpcap package. This is a bit of a hack, as I already had tcpdump and wireshark installed from Fedora's default repos, and both depend on the specific libpcap version that Fedora ships. I just removed wireshark and tcpdump:

su -c 'rpm -e wireshark wireshark-gnome tcpdump'

If you are using gnome and the lovely NetworkManager there is also one remaining package that depends on libpcap, the ppp package. I left it where it was as lots of things depend on ppp :). When I installed the new libpcap package I used --nodeps:

cd RPMS/x86_64/
sudo rpm -Uvh libpcap-1.0.0-5.20091201git117cb5.fc11.x86_64.rpm libpcap-devel-1.0.0-5.20091201git117cb5.fc11.x86_64.rpm --nodeps

In general this is a bad idea, but I know I never use ppp so it is an ok risk for me to take. Notice too that the RPMs I built ended up in the x86_64 directory in ~/rpmbuild/RPMS. That is because my system's arch is x86_64.

Now that the new libpcap-devel package is installed, the wireshark build will use those libs. The build is similar, except that wireshark uses rpath a bit. See this for some background:

http://fedoraproject.org/wiki/RPath_Packaging_Draft

Luckily it is easy enough to turn off the warning and allow the package to be built. In the ~/rpmbuild directory again:

QA_RPATHS=$(( 0x0001|0x0010 )) rpmbuild -ba SPECS/wireshark.spec
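
The QA_RPATHS value is just a bitmask of rpath problems to tolerate. The Python sketch below decodes such a mask. The flag descriptions are how I remember the comments in Fedora's check-rpaths script, so treat them as an assumption and compare them with the copy on your own system; the names RPATH_FLAGS and decode_qa_rpaths are made up for this example.

```python
# Flag meanings recalled from Fedora's check-rpaths script (an assumption;
# verify against your local copy before relying on them).
RPATH_FLAGS = {
    0x0001: "standard RPATH (e.g. /usr/lib)",
    0x0002: "invalid RPATH",
    0x0004: "insecure RPATH",
    0x0008: "$ORIGIN RPATH",
    0x0010: "empty RPATH",
    0x0020: "RPATH referencing '..'",
}


def decode_qa_rpaths(mask):
    """Return the descriptions of every flag bit set in mask."""
    return [name for bit, name in sorted(RPATH_FLAGS.items()) if mask & bit]


# The mask used for the wireshark build above.
mask = 0x0001 | 0x0010
print(mask, decode_qa_rpaths(mask))
```

So the build command tolerates two kinds of rpath warning and still fails on anything more serious.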

And wait again. Once the build is finished we can install wireshark packages again and be able to monitor and capture USB packets:

su -c 'rpm -Uvh RPMS/x86_64/wireshark-1.2.2-1.fc11.x86_64.rpm'
su -c 'rpm -Uvh RPMS/x86_64/wireshark-gnome-1.2.2-1.fc11.x86_64.rpm'

Saturday, June 27, 2009

Assign PCI device to a KVM guest for exclusive use under Fedora 11

### Need to figure out how to post XML as code ###

I recently needed to pass a PCI device, a USB controller, to a KVM guest so that the guest had exclusive use of it. The reason was that I needed to use a USB wireless device that connects to a Garmin Forerunner 405, for which tools and software are only available on Windows. The Windows tools include a driver to access the device. I do intend to work on ways to use the Garmin 405 in Linux only, but for the time being, I need to get some training done :)

Note that this blog post is discussing my specific solution - and workarounds to problems using Fedora 11 (F11) on my Lenovo T500. In particular I had to make sure of the following:

- BIOS has all Virt features enabled
- Running x86_64 (I don't waste time with 32 bit anymore)
- Must boot with intel_iommu=on kernel arg to enable VT-d
- Guest system should be 'kvm' type and using qemu-kvm emulator
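
A couple of these prerequisites can be checked from the running host. The Python sketch below looks for the vmx CPU flag in /proc/cpuinfo and for intel_iommu=on in /proc/cmdline. The function names are made up for illustration, and a vmx flag only shows that the CPU advertises VT-x; it does not prove the BIOS setting or that VT-d is actually working.

```python
from pathlib import Path


def has_vmx(cpuinfo_text):
    """True if any 'flags' line of /proc/cpuinfo lists vmx (Intel VT-x)."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            _, _, value = line.partition(":")
            if "vmx" in value.split():
                return True
    return False


def iommu_requested(cmdline_text):
    """True if the kernel was booted with intel_iommu=on."""
    return "intel_iommu=on" in cmdline_text.split()


def read_or_empty(path):
    """Read a file if it exists, so the sketch also runs off Linux."""
    p = Path(path)
    return p.read_text() if p.exists() else ""


print("vmx flag present:    ", has_vmx(read_or_empty("/proc/cpuinfo")))
print("intel_iommu=on given:", iommu_requested(read_or_empty("/proc/cmdline")))
```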

The first thing I had to do was find the right PCI USB controller. My Lenovo T500 has exactly three USB ports on the left side. I determined that the two ports to the rear are attached to one PCI controller, while the front one appears to be attached to another. There are other PCI USB devices in this laptop, but they are available via a docking station or are used for internally attached devices. I found this out by trial and error, plugging a Western Digital external USB disk into each port and then running the command:

virsh nodedev-list --tree


I looked for the PCI root that the scsi/Western Digital USB device appeared under when one port, then another, was used. I eventually found that the two USB ports toward the rear of the case are attached to one PCI device only, pci_8086_293a. So that PCI device was a good one to assign exclusively to the guest.

I ran 'virsh edit win2k3' (win2k3 is the domain name of my Windows guest system) and added a PCI host device entry in the devices section, using the address values worked out below.

The numbers for domain, bus, slot and function are available by dumping the XML like so:

# virsh nodedev-dumpxml pci_8086_293a

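<device>
  <name>pci_8086_293a</name>
  <parent>computer</parent>
  <capability type='pci'>
    <domain>0</domain>
    <bus>0</bus>
    <slot>29</slot>
    <function>7</function>
    <product>82801I (ICH9 Family) USB2 EHCI Controller #1</product>
    <vendor>Intel Corporation</vendor>
  </capability>
</device>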



Simply convert the decimal values to hex to get the PCI address attributes, for example:

domain 0, bus 0:  0 == 0 in hex,   so domain='0x0000' and bus='0x00'
slot 29:          29 == 1d in hex, so slot='0x1d'
function 7:       7 == 7 in hex,   so function='0x7'
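
The same conversion can be scripted. This Python sketch formats the decimal values from the nodedev dump both as hex attributes and as the sysfs device name used further down. The function names are made up for illustration, and the address element layout is an assumption based on libvirt's usual hostdev syntax, not a copy of my guest's configuration.

```python
def hostdev_address(domain, bus, slot, function):
    """Format decimal PCI values as a libvirt-style <address/> element."""
    return ("<address domain='0x%04x' bus='0x%02x' slot='0x%02x' function='0x%x'/>"
            % (domain, bus, slot, function))


def sysfs_name(domain, bus, slot, function):
    """Format the same values as the name used under /sys/bus/pci/devices."""
    return "%04x:%02x:%02x.%x" % (domain, bus, slot, function)


# Values reported for pci_8086_293a: domain 0, bus 0, slot 29, function 7.
print(hostdev_address(0, 0, 29, 7))
print(sysfs_name(0, 0, 29, 7))
```

For this device the second line comes out as 0000:00:1d.7, which matches the sysfs path checked with readlink.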

Once the guest system is configured to use the PCI address, we need to tell the host system to stop using it. The ehci driver is loaded by default for the USB PCI controller:

$ readlink /sys/bus/pci/devices/0000\:00\:1d.7/driver
../../../bus/pci/drivers/ehci_hcd

Detach the device:

$ virsh nodedev-dettach pci_8086_293a

Verify it is now under the control of pci-stub:

$ readlink /sys/bus/pci/devices/0000\:00\:1d.7/driver
../../../bus/pci/drivers/pci-stub

Set an SELinux boolean to allow the management of the PCI device from the guest:

$ setsebool -P virt_manage_sysfs 1

Start the guest system:

virsh start win2k3

Using KVM on Fedora 11, and Intel based systems

I have a small tip for those wishing to use qemu-kvm on Intel based systems - Like my work supplied Lenovo T500.

I found that even though the system supported hardware VT capabilities, I only ever had the qemu emulator available. I never had the opportunity to investigate much further.

When I was investigating PCI device assignment (allowing a virt guest direct access to a device, like a PCI USB controller) I realised my VMs were all running non-hardware-accelerated, plain old qemu emulation, and I needed the ability to assign PCI devices. The only way to do that was to use the qemu-kvm emulator.

The reason is this BZ:

https://bugzilla.redhat.com/show_bug.cgi?id=490477

The IOMMU must be enabled to have VT-d available. The IOMMU feature is disabled by default for good reason, but if you are lucky you won't be affected by the bug. To enable it and get full hardware accelerated KVM:

- Edit /etc/grub.conf ( /boot/grub/grub.conf ) and add an arg intel_iommu=on to the kernel boot line, for example mine looks like so:

title Fedora (2.6.29.5-191.fc11.x86_64)
root (hd0,0)
kernel /vmlinuz-2.6.29.5-191.fc11.x86_64 ro root=/dev/mapper/vg_catfood-lv_root intel_iommu=on rhgb quiet
initrd /initrd-2.6.29.5-191.fc11.x86_64.img



After I made this change and rebooted, I was able to edit the guest's configuration using 'virsh edit win2k3' (my guest's virt domain is named 'win2k3'). I had to change the domain type from qemu to kvm, and change the emulator to qemu-kvm.
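
To show which two parts of the domain XML are involved, here is a Python sketch that makes the same change on a cut-down sample. The sample XML, the function name switch_to_kvm and the /usr/bin/qemu-kvm path are assumptions for illustration, not a copy of my guest's real configuration; check the emulator path on your own system.

```python
import xml.etree.ElementTree as ET

# A heavily trimmed, hypothetical domain definition.
SAMPLE_DOMAIN = """<domain type='qemu'>
  <name>win2k3</name>
  <devices>
    <emulator>/usr/bin/qemu</emulator>
  </devices>
</domain>"""


def switch_to_kvm(domain_xml, emulator="/usr/bin/qemu-kvm"):
    """Set the domain type to kvm and point the emulator at qemu-kvm."""
    root = ET.fromstring(domain_xml)
    root.set("type", "kvm")
    emu = root.find("./devices/emulator")
    if emu is not None:
        emu.text = emulator
    return ET.tostring(root, encoding="unicode")


print(switch_to_kvm(SAMPLE_DOMAIN))
```

In practice I made the edit by hand in 'virsh edit'; the sketch only highlights the type attribute and the emulator element.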

My guest systems that use qemu-kvm perform much better now that hardware acceleration is working!!

Saturday, June 20, 2009

Media Centre hardware list

Zalman HD160XT PLUS BLACK HTPC ATX Case from UMART

Intel DG45ID motherboard from UMART

I have chosen the Intel motherboard because it is a low power, compact design for a media centre, and it has all the sound and video features I want. Most importantly, its Intel graphics are supported by the drivers in the xorg release.

This motherboard takes LGA775 socket CPUs, and for a media centre I don't think I will need much more than a dual core. I am not at all sure what to get yet, but something low energy would be good.

It should not need much memory, but I will possibly get 4GB of 800MHz DDR2.

I'd like to get multiple disks and set up a storage host, with the media centre accessing the storage over the network via iSCSI. I am not sure what the performance impact will be compared to local disks.

The most important component apart from the video/graphics display controller is the TV capture/tuner card. I have the Hauppauge Nova-T 500 in mind; all reports so far say it works well with MythTV:

http://www.techbuy.com.au/p/59739/Hauppauge/NOVA500MCE.asp

Planning a media centre possibly using MythTV - The case

So, looking at putting together a MythTV box, based on Fedora, naturally!!

My draft hardware list includes a Zalman HD160XT PLUS BLACK HTPC ATX Case from UMART

This case has an LCD touch screen, which might be useful. It needs the evtouch X11 driver; the good guys from Debian have an unstable package that has been patched slightly from the original source to build with Xorg:

http://packages.debian.org/source/sid/xf86-input-evtouch

Additional reference, to set up the remote and screens :

http://www.mythtv.org/wiki/Zalman_HD160XT

I have so far manually edited the original source from http://www.conan.de/touchscreen/evtouch.html , basing the changes on what Debian's unstable package has. I have managed to get it to build ok on Fedora 11. The steps follow.

  • Grab a x11 driver package from source
yumdownloader --source xorg-x11-drv-dummy

  • Run yum-builddep on the src.rpm package
su -c 'yum-builddep xorg-x11-drv-dummy-0.3.1-2.fc11.src.rpm'

This pulls in the build dependencies for x11 driver packages.

  • Grab evtouch sources
wget http://www.conan.de/touchscreen/xf86-input-evtouch-0.8.8.tar.bz2

  • Extract the source
tar -jxvf xf86-input-evtouch-0.8.8.tar.bz2

  • Apply the patch (no patch file is available online yet, TODO)
The patched driver remains untested as I have no hardware yet! Until I get a place to drop the files, here is the patch:


##--CUT HERE--##
diff -Naur xf86-input-evtouch-0.8.8.orig/ev_calibrate.c xf86-input-evtouch-0.8.8/ev_calibrate.c
--- xf86-input-evtouch-0.8.8.orig/ev_calibrate.c 2008-11-10 21:25:32.000000000 +1000
+++ xf86-input-evtouch-0.8.8/ev_calibrate.c 2009-06-20 22:46:37.896834159 +1000
@@ -218,8 +218,7 @@
int cap_style = CapButt; /* style of the line's edje and */
int join_style = JoinBevel; /* joined lines. */

- int event_mask = ExposureMask | ButtonReleaseMask | PointerMotionMask | KeyPressMask;
-
+ int event_mask = ExposureMask | ButtonPressMask | ButtonReleaseMask | PointerMotionMask | KeyPressMask;
int depth;
int screen_num;
int screen_width;
diff -Naur xf86-input-evtouch-0.8.8.orig/evtouch.c xf86-input-evtouch-0.8.8/evtouch.c
--- xf86-input-evtouch-0.8.8.orig/evtouch.c 2008-11-11 18:47:55.000000000 +1000
+++ xf86-input-evtouch-0.8.8/evtouch.c 2009-06-20 22:52:01.551743739 +1000
@@ -29,11 +29,8 @@
#endif

#define _evdev_touch_C_
-
-#include
-#if XF86_VERSION_CURRENT >= XF86_VERSION_NUMERIC(3,9,0,0,0)
+#include
#define XFREE86_V4
-#endif

/*****************************************************************************
* Standard Headers
@@ -74,7 +71,6 @@
#include "xf86_OSproc.h"
#include "xf86Xinput.h"
#include "exevents.h"
-#include "xf86OSmouse.h"
#include "randrstr.h"

#ifndef NEED_XF86_TYPES
@@ -139,7 +135,7 @@
"Kenan Esau",
MODINFOSTRING1,
MODINFOSTRING2,
- XF86_VERSION_CURRENT,
+ XORG_VERSION_CURRENT,
0, 8, 8,
ABI_CLASS_XINPUT,
ABI_XINPUT_VERSION,
@@ -306,7 +302,7 @@
}

if (pos_changed == 1) {
-#if GET_ABI_MAJOR(ABI_XINPUT_VERSION) == 2
+#if GET_ABI_MAJOR(ABI_XINPUT_VERSION) >= 2
ConvertProc(priv->local, 0, 2,
priv->raw_x, priv->raw_y,
0, 0, 0, 0,
@@ -352,7 +348,6 @@
void EVTouchProcessRel(EVTouchPrivatePtr priv)
{
struct input_event *ev; /* packet being/just read */
- int dummy;

ev = &priv->ev;
if ( ev->code == REL_X ) {
@@ -370,7 +365,7 @@
priv->raw_y = priv->min_y;
}

-#if GET_ABI_MAJOR(ABI_XINPUT_VERSION) == 2
+#if GET_ABI_MAJOR(ABI_XINPUT_VERSION) >= 2
ConvertProc(priv->local, 0, 2,
priv->raw_x, priv->raw_y,
0, 0, 0, 0,
@@ -653,14 +648,18 @@
* Device reports motions on 2 axes in absolute coordinates.
* Axes min and max values are reported in raw coordinates.
*/
- if (InitValuatorClassDeviceStruct(dev, 2, xf86GetMotionEvents,
+ if (InitValuatorClassDeviceStruct(dev, 2,
+#if GET_ABI_MAJOR(ABI_XINPUT_VERSION) == 0
+ xf86GetMotionEvents,
+#endif
+
local->history_size, Absolute) == FALSE)
{
ErrorF ("Unable to allocate EVTouch touchscreen ValuatorClassDeviceStruct\n");
return !Success;
}

-#if GET_ABI_MAJOR(ABI_XINPUT_VERSION) == 2
+#if GET_ABI_MAJOR(ABI_XINPUT_VERSION) >= 2
xf86InitValuatorAxisStruct(dev, 0, 0, priv->screen_width,
1024,
EV_AXIS_MIN_RES /* min_res */ ,
@@ -743,19 +742,6 @@
}


-
-
-static unsigned char
-EVTouchRead(EVTouchPrivatePtr priv)
-{
- unsigned char c;
- XisbBlockDuration (priv->buffer, EV_TIMEOUT);
- c = XisbRead(priv->buffer);
- return (c);
-}
-
-
-
static Bool
EVTouchGetPacket (EVTouchPrivatePtr priv)
{