Accelerating Remote X Performance
Introduction
In the 2007 NERSC survey, several users complained about poor network performance for
interactive, GUI-based applications or claimed that they did not use NERSC resources
for interactive GUI-based applications due to poor network performance. The goal of
this project was to determine and evaluate alternatives to X11 tunneling/compression
via ssh that alleviate X11 performance problems that are mainly caused by the high
network latency to remote locations.
Alternatives under consideration
The X11 protocol was designed for local area network connections. Because of this
fact, several design decisions lead to poor performance over wide area network
connections. In particular: (i) the X protocol is very verbose, requiring
comparatively large amounts of data to be sent over the network and (ii) many
operations require a "handshake" between client and server leading to long "wait"
times on high-latency links before an operation can be completed. This section gives
a brief review of technologies that we considered as alternatives to ssh X11
tunneling, which alleviates only (i) if compression is enabled. The following solutions (i)
decrease verbosity by means of compression and (ii) minimize the impact of high RTT
latency.
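To make point (ii) concrete, the following minimal sketch estimates how long an application waits on the network when an operation needs a number of synchronous request/reply exchanges. The round-trip count of 50 and the 0.2ms LAN latency are purely illustrative assumptions, not measured values; the other RTT values are the estimates reported in the section "Expected Latency Range".

```python
def blocking_wait_seconds(round_trips: int, rtt_ms: float) -> float:
    """Lower bound on the network wait for `round_trips` synchronous exchanges."""
    return round_trips * rtt_ms / 1000.0


# Hypothetical operation that needs 50 synchronous round trips (illustrative only).
ROUND_TRIPS = 50

# The LAN value is an assumption; the site values are the RTT estimates from this report.
for site, rtt_ms in [("LAN", 0.2), ("UCLA", 10.0), ("ORNL", 66.0), ("PPPL", 80.0)]:
    wait = blocking_wait_seconds(ROUND_TRIPS, rtt_ms)
    print(f"{site:5s} RTT {rtt_ms:5.1f} ms: at least {wait:.2f} s spent waiting")
```

Under these assumptions the wait grows linearly with the RTT, regardless of how much bandwidth is available, which is why compression alone does not help on high-latency links.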
dxpc
The Differential X Protocol Compressor (dxpc) is "an X protocol compressor
designed to improve the speed of X11 applications run over low-bandwidth links (such
as dial up PPP connections)" (http://www.vigor.nu/dxpc/). It "understands"
the X protocol and "intelligently" compresses X network traffic. Dxpc is designed
to cope with low-bandwidth links. We did not consider it as a candidate solution
because it does not employ any strategies to cope with high-latency links, which seem
to be the main cause of poor remote performance. It is included in this overview
because it serves as the basis for (Free)NX.
(Free)NX
NX is a commercial product by NoMachine Inc. Its compression of X11 traffic
(nxproxy) is based on dxpc technology. In addition, NX sets up an environment
that makes it possible to cope with high-latency links by "shortcutting" X
requests and the corresponding replies. NoMachine released the base
implementation of the server as open source, and clients are freely available
from the NoMachine web page. Based on this source code, FreeNX provides users
with scripts that simplify setup of the server on a machine. Many Linux
distributions (Fedora Core, OpenSUSE) include NX server packages based on
FreeNX.
Virtual Network Computing (VNC)
VNC shares the desktop of a machine with different clients. It uses the RFB (Remote
Frame Buffer) protocol to control another computer remotely. Unlike X11, RFB
transfers frame buffer contents instead of individual commands to draw
graphical objects. This mitigates latency problems, as it requires less
synchronization between remote machines. VNC desktop sharing is available for a wide
range of operating systems/desktops, including X11, Windows and MacOS. The most
common way to share an X desktop via VNC is Xvnc, which starts a new X server with a
virtual display.
- RealVNC is the successor of the
original VNC implementation. It is available as a commercial version and as a free
version that provides fewer features. Most Linux distributions offer either
RealVNC (free version) or TightVNC out of the box.
- TightVNC is a free VNC
implementation. Next to RealVNC, it is the most commonly used implementation. TightVNC offers
improved image compression over the original RealVNC version.
- VirtualGL and TurboVNC: VirtualGL is a
means to display OpenGL output remotely. It uses graphics hardware in a remote
machine to perform the actual rendering. For remote access it can operate in a
mode where it translates GLX requests into regular X traffic (by rendering into
a buffer and sending resulting images as X pixmaps). It is possible to use
VirtualGL with FreeNX and VNC. For VNC there exists a custom client, TurboVNC
(based on TightVNC), which adds double buffer support to VNC and uses
TurboJPEG for fast, efficient JPEG compression. VirtualGL is not well suited
for current NERSC machines as they lack dedicated graphics hardware. Once NERSC
makes machines with dedicated GPUs available to users, VirtualGL and/or the
Chromium Render Server (CRRS) will most likely be part of a solution to make
this hardware available to NERSC users.
Security Considerations
(Free)NX
(Free)NX uses the X11 protocol and ssh to make a machine remotely accessible. Because
it builds on these existing mechanisms, it should not open new security vulnerabilities. One possible
concern is that (Free)NX requires creating a new user ("nx"), which is used to
establish the X11 connection to a remote machine. The nx client
authenticates itself as this user via an ssh key. Currently available nx clients
require that this key does not have a pass phrase. If this key becomes
compromised, it will create a vulnerability. However, the nx user does not provide
access to a regular shell and ssh features, such as port forwarding, are now disabled
by default for that key. In any case, it will be necessary to distribute that key to
NERSC users in a secure way. (It may be advisable to give each user a separate ssh
key to the nx account to limit this vulnerability.)
VNC
VNC is a new service (i.e., it is not currently in use on NERSC machines) and may
create additional vulnerabilities. It performs its own authentication independent of site
policy, requiring the user to choose a new password for sharing the desktop.
This password is transmitted and stored in an insecure way. This problem can (and
should) be mitigated by tunneling the VNC connection via ssh and making the VNC port
inaccessible from outside networks. In a way, this password is comparable to the
X11 magic cookie, which is similarly insecure.
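As an illustration of the ssh tunneling mentioned above, the following sketch builds an ssh command line that forwards a local port to the VNC port on the remote machine. It relies on the convention that VNC display :N listens on TCP port 5900 + N. The user name, host name and display number are hypothetical placeholders, and this is one possible setup rather than a recommended NERSC configuration.

```python
import shlex

VNC_BASE_PORT = 5900  # by convention, VNC display :N listens on TCP port 5900 + N


def vnc_tunnel_command(user, host, display, local_port=None):
    """Return the argv of an ssh call that forwards a local port to a remote VNC display."""
    remote_port = VNC_BASE_PORT + display
    if local_port is None:
        local_port = remote_port
    # -N: do not run a remote command; -L: forward local_port to localhost:remote_port
    # as seen from the remote machine.
    return ["ssh", "-N", "-L", f"{local_port}:localhost:{remote_port}", f"{user}@{host}"]


# Hypothetical user, host and display, for illustration only.
cmd = vnc_tunnel_command("alice", "login.example.org", display=1)
print(shlex.join(cmd))
```

With such a tunnel in place, the VNC viewer connects to the forwarded port on the local machine, so the VNC port itself never needs to be reachable from outside networks.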
Test Methodology
To test the various solutions, it was necessary to simulate various network
conditions. Our basic setup consisted of three machines: one machine simulating the
NERSC resource/computer, one machine serving as the user machine, and a third machine
configured as a network bridge that is used to set and monitor various network bandwidth and
latency configurations. This setup is similar to the setup proposed by a study on X11
network performance (http://keithp.com/~keithp/talks/usenix2003/html/net.html).
In this setup, the alternatives (ssh, NX, VNC) were evaluated given different network
parameters (latency, bandwidth).
Network Simulation
We used the NIST network simulator to simulate the network. The NIST network
simulator allows users to specify delays for packets between machines, as well as
to limit the bandwidth. The simulator supports more elaborate settings, such as
simulating varying latency distributions, but we did not use them in this context.
After specifying latency and bandwidth constraints, we verified network conditions
using iperf. We did so as a sanity check, as well as to measure the actual
bandwidth over a TCP/IP connection, since latency alone influences TCP/IP
performance.
Expected Latency Range
We worked with the NERSC Network and Security group to obtain estimates on ping
times and available bandwidth between NERSC and the work sites of various NERSC
users. In particular, we collected detailed information about the network
connection between NERSC and the Oak Ridge National Laboratory (ORNL) as well
as the Princeton Plasma Physics Laboratory (PPPL) by asking remote users to
send us statistics obtained using the "NERSC Web100 based Network Diagnostic Tool
(NDT)". The results suggest that round-trip time (RTT) latencies to UCLA, ORNL
and PPPL are approximately 10ms, 66ms and 80ms, respectively.
Measuring GUI Application Performance
We developed a timer application that allows us to measure the responsiveness
of GUI-based applications. This timer runs on an X-based system and uses
the XTEST and XDamage extensions. XTEST supports simulating mouse button press
events, and XDamage makes it possible to receive notifications about screen
updates. Our timer application uses XTEST to simulate mouse button clicks and
keeps a time stamp of when it sent a mouse button press event. It subsequently
monitors screen updates in a specified region of the screen and waits until no
framebuffer changes occur for a user-specified time. By doing so, it makes it
possible to measure the time between a mouse button press, e.g., on a menu, and
the time of the last screen update that occurs in response to the event,
providing an objective measure of application responsiveness. The timer tool
allowed us to consider regular applications that NERSC users are likely to use
instead of having to resort to synthetic test applications. We also performed
tests to evaluate subjective user experience for operations that are difficult
to measure, e.g., moving a window around on the desktop.
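The core of this measurement can be sketched independently of X11. The function below is not the timer tool itself: it only models the quiescence rule described above, operating on a list of screen-update time stamps such as those XDamage notifications would provide. All names and the sample values are hypothetical, and the sketch assumes the quiet period is longer than the delay before the first update.

```python
from typing import Sequence


def response_time_ms(press_ms: int, update_ms: Sequence[int], quiet_ms: int) -> int:
    """Time from the button press to the last screen update of the response.

    Updates are followed in time order until a gap longer than `quiet_ms`
    occurs; anything after that gap is treated as unrelated to the press.
    """
    last = press_ms
    for t in sorted(update_ms):
        if t < press_ms:
            continue  # update happened before the simulated click
        if t - last > quiet_ms:
            break  # screen was quiet long enough; the response is complete
        last = t
    return last - press_ms


# Hypothetical trace: click at t=0 ms, three updates in quick succession,
# then an unrelated update much later.
elapsed = response_time_ms(0, [400, 500, 900, 3000], quiet_ms=1000)
print(f"response time: {elapsed} ms")
```

In this trace the update at 3000ms arrives more than one quiet period after the update at 900ms, so the response is considered complete after 900ms. Choosing the quiet period is a trade-off: too short and a slow response is cut off early, too long and each measurement takes longer to finish.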
Other Criteria
While this study was motivated by complaints about X11 performance over
high-latency networks, other considerations also needed to be taken into
account when deciding on a solution. These criteria were:
- Security - The new remote access methods open new network ports and
may add vulnerabilities to the system. These must be evaluated. Furthermore,
coordination with the Networking, Security and Services Group (NERSC-NAST) will be
necessary.
- Usability/ability to deploy - The new technology must be easily
usable by NERSC users. Furthermore, since both NX and VNC use lossy image compression,
quality is a concern.
- Licensing issues - If necessary, it must be feasible to purchase
licenses. In particular a large number of processors on NERSC machines may be of
concern.
Timing Results
We performed tests for two example sites. The UCLA test case served to measure
benefits for off-site users that are still relatively close to NERSC. The PPPL test
case, on the other hand, is close to worst-case conditions encountered by users in
the continental United States. In both test instances, we only simulated RTT
latencies (10ms for UCLA and 80ms for PPPL). Even though we did not impose any
bandwidth limits, the flow control used in TCP/IP will limit bandwidth depending on
RTT latency. We used the iperf tool to measure available bandwidth on our test
networks. The resulting measurements for UCLA are:
ghweber@hpcrd7:~> iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local 10.0.0.1 port 5001 connected with 10.0.1.2 port 33694
[ 4] 0.0-10.0 sec 387 MBytes 324 Mbits/sec
ghweber@gunther3:~> iperf -c gunther1
------------------------------------------------------------
Client connecting to gunther1, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[ 3] local 10.0.1.2 port 33694 connected with 10.0.0.1 port 5001
[ 3] 0.0-10.0 sec 387 MBytes 324 Mbits/sec
For PPPL the resulting measurements are:
ghweber@hpcrd7:~> iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local 10.0.0.1 port 5001 connected with 10.0.1.2 port 54314
[ 4] 0.0-10.1 sec 46.8 MBytes 38.8 Mbits/sec
ghweber@gunther3:~> iperf -c gunther1
------------------------------------------------------------
Client connecting to gunther1, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[ 3] local 10.0.1.2 port 54313 connected with 10.0.0.1 port 5001
[ 3] 0.0-10.0 sec 46.4 MBytes 38.9 Mbits/sec
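The dependence of TCP throughput on RTT can be made explicit with the bandwidth-delay product: to sustain a given rate, a sender must keep rate × RTT bytes in flight. The sketch below applies this standard relation to the two server-side iperf results above. It is a back-of-the-envelope check, not part of the original test procedure, and it treats 1 Mbit as 10^6 bits.

```python
def implied_window_bytes(throughput_mbit_s: float, rtt_ms: float) -> float:
    """Bytes that must be in flight to sustain the throughput at the given RTT."""
    return throughput_mbit_s * 1e6 * (rtt_ms / 1e3) / 8.0


def max_throughput_mbit_s(window_bytes: float, rtt_ms: float) -> float:
    """Upper bound on throughput for a fixed window: window / RTT."""
    return window_bytes * 8.0 / (rtt_ms / 1e3) / 1e6


# Throughput from the server-side iperf output, RTT from the simulated links.
for site, mbit_s, rtt_ms in [("UCLA", 324.0, 10.0), ("PPPL", 38.8, 80.0)]:
    window_kb = implied_window_bytes(mbit_s, rtt_ms) / 1e3
    print(f"{site}: {mbit_s} Mbit/s at {rtt_ms} ms RTT implies ~{window_kb:.0f} kB in flight")
```

Both measurements imply roughly 400 kB in flight (about 405 kB and 388 kB), which is consistent with a similar effective window in both tests and the eightfold increase in RTT accounting for most of the drop in throughput.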
We tested three alternatives
- ssh X11 forwarding,
- VNC (TightVNC), and
- (Free)NX
under these network conditions using MATLAB as a test application. In sequence, we
established a connection, started MATLAB (without splash screen), opened an edit
window and cycled through some pulldown menus of the edit window.
Tables 1 and 2 show the results of our measurements.
Both tables show significant benefits of VNC and NX over simple ssh X11 tunneling.
These benefits are more pronounced over the high-latency link, where
response times improve by an order of magnitude in most cases.
Table 1: Test results for simulated connection to UCLA.

| Action | SSH | VNC | FreeNX |
|---|---|---|---|
| Establish connection | n/a | ≈11s | ≈16s |
| Start Matlab (-nosplash) | 9.6s | 4.9s | 5s |
| Open edit window | 2.9s | 1.3s | 1.2s |
| Activate File menu | 0.6s | 0.1s | 0.1s |
| Activate Edit menu | 0.6s | 0.1s | 0.1s |
| Activate Text menu | 0.5s | 0.2s | 0.1s |
| Close edit window, redraw main window | 1.5s | 0.4s | 0.3s |
| Close matlab | 0.5s | 0.6s | 0.6s |

Table 2: Test results for simulated connection to PPPL.

| Action | SSH | VNC | FreeNX |
|---|---|---|---|
| Establish connection | n/a | ≈5.7s | ≈10.2s |
| Start Matlab (-nosplash) | 39.5s | 4.6s | 5.4s |
| Open edit window | 14.9s | 1.3s | 1.12s |
| Activate File menu | 3.7s | 0.3s | 0.2s |
| Activate Edit menu | 7.6s | 0.4s | 0.2s |
| Activate Text menu | 5.1s | 0.4s | 0.3s |
| Close edit window | 7.3s | 1.4s | 1.8s |
| Close matlab | 2.1s | 1.54s | 1.1s |

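The "order of magnitude" statement can be checked directly against Table 2. The following sketch computes the speedup of VNC and FreeNX relative to ssh X11 forwarding for each timed action on the simulated PPPL link; the numbers are copied from the table, and the variable and function names are only illustrative.

```python
# Response times in seconds (SSH, VNC, FreeNX), copied from Table 2.
TABLE_2 = {
    "Start Matlab (-nosplash)": (39.5, 4.6, 5.4),
    "Open edit window": (14.9, 1.3, 1.12),
    "Activate File menu": (3.7, 0.3, 0.2),
    "Activate Edit menu": (7.6, 0.4, 0.2),
    "Activate Text menu": (5.1, 0.4, 0.3),
    "Close edit window": (7.3, 1.4, 1.8),
    "Close matlab": (2.1, 1.54, 1.1),
}


def speedups(rows):
    """Map each action to (SSH time / VNC time, SSH time / FreeNX time)."""
    return {action: (ssh / vnc, ssh / nx) for action, (ssh, vnc, nx) in rows.items()}


for action, (vnc_x, nx_x) in speedups(TABLE_2).items():
    print(f"{action:26s} VNC {vnc_x:5.1f}x   FreeNX {nx_x:5.1f}x")
```

For both VNC and FreeNX, four of the seven actions are at least ten times faster than with ssh forwarding. The smallest gains occur when closing windows or the application, and the largest for menu activation.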
Conclusions
Based on the measurements, NX and VNC perform very similarly. Currently, we have
chosen to deploy (Free)NX, mainly because setting up an NX connection requires
considerably fewer steps on the remote user's side. Thus, NX is more convenient and
easier to use, making it more likely to be utilized. Even though NX clients are
available for most significant platforms (Linux, MacOS X, Windows and Solaris), VNC
clients are more widely available including Java solutions. In the long term, it may
be beneficial to offer both alternatives to NERSC users. In particular, we need to
reevaluate these solutions if graphics hardware becomes available for analytics use
at NERSC.