Friday, December 29, 2006

Trust, Access Control, and Rights for Web Services

XML Key Management Specification (XKMS)

The XML Key Management Specification is built on top of and complements the XML standards for Digital Signature and Encryption. XKMS reached version 2.0 W3C working draft in April 2003.

By now, you see that Web services need end-to-end message integrity and confidentiality, which means that they need XML Digital Signature and XML Encryption. Those technologies, in turn, scale best when they use public key cryptography. Public key cryptography needs a supporting infrastructure, PKI, to handle distribution, certification, and life-cycle management (for example, revocation) of keys. PKI has proven to be very difficult and expensive to build and maintain in practice, and many failures have given it a bad reputation as an almost "failed" technology. Web services themselves provide a powerful new approach to PKI that prevents each Web service requestor and provider from having to build their own PKI: accessing a trusted PKI as a service. XKMS aims to do just that.
Origins of XKMS

XKMS specifies protocols for distributing and registering public keys suitable for use in conjunction with the XML Digital Signature standard and the XML Encryption standard. XKMS is composed of two parts:

* XML Key Information Service Specification (X-KISS)
* XML Key Registration Service Specification (X-KRSS)

X-KISS is a protocol to support the creation of a service to which an application delegates the processing of Key Information. Thus, applications needing keys for use with an XML Signature, XML Encryption, or other use of the <ds:KeyInfo> element can delegate the necessary complex key management to a shared service.

X-KRSS is a protocol to support the registration and management of a key pair by a key pair holder, with the intent that the key pair subsequently be usable in conjunction with the XML Key Information Service Specification or a Public Key Infrastructure such as X.509 or PKIX.
Goals of XKMS

XKMS's first goal is to support a simple client's capability to use sophisticated key management functionality. Such a simple client is not concerned with the details of the infrastructure required to support the public key management but may choose to work with X.509 certificates if it is able to manage the details. This ties back to the biggest impediment for PKI, which has been the lack of client support. This goal does not directly impact the discussion of PKI for Web services, but the second goal does.

The second goal is to provide public key management support to XML applications. In particular, it is a goal of XML key management to support the public key management requirements of XML Encryption and XML Digital Signature, and to be consistent with SAML.

One sample use of XKMS is for implementing "transaction accountability." When a Web service embeds trust in electronic transactions using digital signatures, digital receipts, and notary services based on business policies, XKMS can, when needed, transparently link to a trust Web service to affix digital signatures, notary stamps, and digital receipts to XML documents and to validate them.

In this scenario, XKMS represents a strong tangible benefit of XML Signature. The presence of XKMS means that use of XML Signature can be independent of PKI vendor implementations and enables Web services to offer a wider range of options for trust relationships. In particular, access to an XKMS service makes it easier to add attribute-bindings to messages than it would be to add X.509 certificate extensions that require a tight relationship with a PKI vendor.

X-KISS

The X-KISS Locate service resolves a <ds:KeyInfo> element. It is a name resolution service. The service may resolve the <ds:KeyInfo> element using local data or may relay the request to other servers. For example, the XKMS service might resolve a <ds:RetrievalMethod> element or act as a gateway to an underlying PKI based on a non-XML syntax.


Figure 9.6 XKMS message types and their relationship to the
XKMS client and the Trust Service.

Here's a sample scenario: A Web service receives a signed document that specifies the sender's X.509v3 certificate but not the key value (which is embedded in the X.509 certificate). The Web service is not capable of processing X.509v3 certificates but can obtain the key parameters from the XKMS service by means of the Locate service. The Web service sends the <ds:KeyInfo> element to the Locate service and requests that the <KeyName> and <KeyValue> elements be returned, as shown in Listing 9.7. When it has these elements, it has the information needed to decode the XML Digital Signature it just received.


Listing 9.7 X-KISS Request to XKMS Locate Service to Process X.509 Certificates to Obtain Key Parameters


<LocateRequest xmlns:ds="http://www.w3.org/2000/09/xmldsig#"
   xmlns:xenc="http://www.w3.org/2001/04/xmlenc#"
   Id="I4593b8d4b6bd9ae7262560b5de1016bc"
   Service="http://test.xmltrustcenter.org/XKMS"
   xmlns="http://www.w3.org/2002/03/xkms#">
   <RespondWith>KeyValue</RespondWith>
   <QueryKeyBinding>
      <ds:KeyInfo>
         <ds:X509Data>
            <ds:X509Certificate>
MIICAjCCAW+gAwIBAgIQlzQov
IEbLLhMa8K5MR/juzAJBgUrDgMCHQUAMBIxEDAOBgNVBAMTB1Rlc3QgQ0EwHhcNMDIwNjEzMjEzMzQ
xWhcNMzkxMjMxMjM
1OTU5WjAsMSowKAYDVQQGEyFVUyBPPUFsaWNlIENvcnAgQ049QWxpY2UgQWFyZHZhcmswgZ8wDQYJK
oZIhvcNAQEBBQADg
Y0AMIGJAoGBAMoy4c9+NoNJvJUnV8pqPByGb4FOJcU0VktbGJpO2imiQx+EJsCt27z/pVUDrexTyctC
WbeqR5a40JCQmvN
mRUfg2d81HXyA+iYPl4L6nUlHbkLjrhPPtMDSd5YHjyvnCN454+Hr0paA1MJXKuw8ZMkjGYsr4fSYpP
ELOH5PDJEBAgMBA
AGjRzBFMEMGA1UdAQQ8MDqAEEVr1g8cxzEkdMX4GAlD6TahFDASMRAwDgYDVQQDEwdUZXN0IENBghBy
sVHEiNFiiE2lxWv
mJYeSMAkGBSsOAwIdBQADgYEAKp+RKhDMIVIbooSNcoIeV/wVew1bPVkEDOUwmhAdRXUA94uRifiFfm
p9GoN08Jkurx/gF
18RFB/7oLrVY+cpzRoCipcnAnmh0hGY8FNFmhyKU1tFhVFdFXB5QUglkmkRntNkOmcb8O87xO0Xktmv
NzcJDes9PMNxrVt
ChzjaFAE=
            </ds:X509Certificate>
         </ds:X509Data>
      </ds:KeyInfo>
      <KeyUsage>Signature</KeyUsage>
   </QueryKeyBinding>
</LocateRequest>

When the Locate service receives the X.509v3 certificate in the request shown in Listing 9.7, it extracts the key information from the certificate and constructs the elements it needs to return to the requesting service, as shown in Listing 9.8.

Listing 9.8 Response from XKMS Locate Service to Preceding Request


<LocateResult xmlns:ds="http://www.w3.org/2000/09/xmldsig#"
   xmlns:xenc="http://www.w3.org/2001/04/xmlenc#"
   Id="I46ee58f131435361d1e51545de10a9aa"
   Service="http://test.xmltrustcenter.org/XKMS" ResultMajor="Success"
   RequestId="#I4593b8d4b6bd9ae7262560b5de1016bc"
   xmlns="http://www.w3.org/2002/03/xkms#">
   <UnverifiedKeyBinding>
      <ds:KeyInfo>
         <ds:KeyValue>
            <ds:RSAKeyValue>
               <ds:Modulus>
zvbTdKsTprGAKJdgi7ulDR0eQBptLv/SJNIh3uVmPBObZFsLbqPwo5nyLOkzWlEHNbS
hPMRp1qFrAfF13L
MmeohNYfCXTHLqH1MaMOm+BhXABHB9rUKaGoOBjQPHCBtHbfMGQYjznGTpfCdTrUgq8VNlqM2Ph9XWMc
c7qbjNHw8=
               </ds:Modulus>
               <ds:Exponent>AQAB</ds:Exponent>
            </ds:RSAKeyValue>
         </ds:KeyValue>
      </ds:KeyInfo>
      <KeyUsage>Signature</KeyUsage>
      <KeyUsage>Encryption</KeyUsage>
      <KeyUsage>Exchange</KeyUsage>
   </UnverifiedKeyBinding>
</LocateResult>



The X-KISS Validate service goes a step further: in addition to locating key information, the client may obtain an assertion from the X-KISS service specifying the status of the binding between the public key and other data—for example, a name or a set of extended attributes. Furthermore, the service represents that the status of each data element returned is valid and that all are bound to the same public key. The client sends the XKMS service a prototype containing some or all of the elements for which the status of the key binding is required. If the information in the prototype is incomplete, the XKMS service may obtain additional data from an underlying PKI service, as depicted in Figure 9.7. After the validity of the key binding has been determined, the XKMS service returns the status result to the client.

Figure 9.7 The Validate service provides key validation, usually
sitting on top of a PKI at a trusted third party.

No single set of validation criteria is appropriate to every circumstance. Applications involving financial transactions are likely to require the application of very specific validation criteria that ensure certain contractual and/or regulatory policies are enforced. The Locate service provides a key discovery function that is neutral with respect to the validation criteria that the client application may apply. The Validate service provides a key discovery and validation function that produces results that are specific to a single set of validation criteria.

X-KRSS

From a Web services point of view, Locate and Validate will be the most common forms of XKMS service requested. Depending on the nature of the Web service provided and the security policy in place, X-KRSS messages such as Register, Recover, Revoke, and Reissue may be processed only under a much more stringent environment.

In the registration phase, as shown in Figure 9.8, an XML application key pair holder registers its public key with a trusted infrastructure via a registration server. The public key is sent to the registration server in a digitally signed request, as specified by X-KRSS. The registration server responds with an XML-formatted confirmation, which indicates the status of the registration (accepted, rejected, or pending) and confirms the name and attribute information registered with the public key. Except in the case of rejection, a key pair identifier is returned for subsequent referencing purposes. The registration is typically preceded by generation of the key pair on the key pair holder's system.


Figure 9.8 X-KRSS key registration.

A sample X-KRSS request is shown in Listing 9.9.

Listing 9.9 X-KRSS Request to XKMS Registration Service for Key Registration


<Register xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
   xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
   xmlns:xsd="http://www.w3.org/1999/XMLSchema"
   xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
   <Prototype>
      <Status>Valid</Status>
      <KeyID>mailto:Alice@cryptographer.test</KeyID>
      <ds:KeyInfo>
         <ds:KeyName>mailto:Alice@cryptographer.test</ds:KeyName>
         <ds:KeyValue>
            <ds:RSAKeyValue>
               <ds:Modulus>
998/T2PUN8HQlnhf9YIKdMHHGM7HkJwA56UD0a1oYq7EfdxSXAidruAszNqBoOqfarJ
IsfcVKLob1hGnQ/l6xw==
               </ds:Modulus>
               <ds:Exponent>AQAB</ds:Exponent>
            </ds:RSAKeyValue>
         </ds:KeyValue>
      </ds:KeyInfo>
      <ValidityInterval>
         <NotBefore>2000-09-20T12:00:00</NotBefore>
         <NotAfter>2001-09-20T12:00:00</NotAfter>
      </ValidityInterval>
   </Prototype>
   <!-- authentication and proof-of-possession data omitted -->
   <Respond>
      <string>KeyName</string>
      <string>KeyValue</string>
   </Respond>
</Register>





The X-KRSS response to this request is shown in Listing 9.10.

Listing 9.10 X-KRSS Response from the XKMS Registration Service


<RegisterResult xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
   xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
   xmlns:xsd="http://www.w3.org/1999/XMLSchema"
   xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
   <Result>Success</Result>
   <Answer>
      <KeyBinding>
         <Status>Valid</Status>
         <KeyID>mailto:Alice@cryptographer.test</KeyID>
         <ds:KeyInfo>
            <ds:KeyValue>
               <ds:RSAKeyValue>
                  <ds:Modulus>
998/T2PUN8HQlnhf9YIKdMHHGM7HkJwA56UD0a1oYq7EfdxSXAidruAszNqBoOqfarJ
IsfcVKLob1hGnQ/l6xw==
                  </ds:Modulus>
                  <ds:Exponent>AQAB</ds:Exponent>
               </ds:RSAKeyValue>
            </ds:KeyValue>
            <ds:KeyName>mailto:Alice@cryptographer.test</ds:KeyName>
         </ds:KeyInfo>
         <ValidityInterval>
            <NotBefore>2000-09-20T12:00:00</NotBefore>
            <NotAfter>2001-09-20T12:00:00</NotAfter>
         </ValidityInterval>
      </KeyBinding>
   </Answer>
</RegisterResult>








Revocation is handled via a similar protocol. The use of desktop (that is, file system) private key storage—as well as broader XML client encryption applications—mandates some form of key recovery provision. Key recovery provides a way to recover a lost private key so that corporate-owned data encrypted with the lost private key is not lost forever. For historical reasons, key recovery is not supported by standardized protocols. In X-KRSS, such support is built in.

Firewalls

So now that you have a fairly secure operating system and know a few basic tricks, let’s get into using some more complex security tools. This chapter describes how to configure and run a secure open source firewall. If you already have a firewall, you may still want to read this chapter if you need a refresher or primer on how firewalls function. This will come in handy in later chapters that discuss port scanners and vulnerability scanners.

A firewall is a device that acts as the first line of defense against any incoming attacks or misuses of your network. It can deflect or blunt many kinds of attacks and shield your internal servers and workstations from the Internet. A firewall can also prevent internal LAN machines from being accessed from outside your network. With the growing use of random scanners and automated worms and viruses, keeping your internal machines shielded from the Internet is more important than ever. A properly configured firewall will get you a long way towards being safe from outside attacks. (Protecting yourself from inside attacks is a different thing altogether and is a subject of Chapters 4 through 7.)

Chapter Overview

Concepts you will learn:

• Basic concepts of TCP/IP networking
• How firewalls operate
• The philosophy of firewall configuration
• Business processes for firewalls
• Sample firewall configurations

Tools you will use: Iptables, Turtle Firewall, and SmoothWall

It’s pretty much a given these days that firewalls are an essential part of any secure infrastructure. There are many very viable commercial alternatives available: Cisco, NetScreen, SonicWALL, and Checkpoint are just a few of the vendors making high-end, commercial firewall solutions. These products are built to handle large corporate networks and high traffic volumes.

Linksys (now owned by Cisco), D-Link, and NETGEAR are some of the vendors making low-end consumer-grade firewalls. These devices generally don’t have much configurability or expandability; they basically act as a packet filter, blocking incoming TCP and UDP connections, and as a NAT appliance. They are usually marketed for DSL and cable-type connections and may buckle under heavier loads.

The higher-end firewalls will do just about anything you want them to do. However, that comes at a price: most of them start at several thousand dollars and go up from there. And they often require you to learn a new syntax or interface in order to configure them. Some of the newer models, like SonicWALL and NetScreen, are moving to a Web-based configuration interface, but that usually comes at the expense of depth in the configuration options.

The little-known and rarely advertised secret of some commercial firewalls is that they have open source software just under the hood. What you are really paying for is the fancy case and the technical support line. This may be worth it for companies that need the extra support. However, if you are going to have to learn yet another interface, and if they are using the same technologies that are available to you for free, why not create your own firewall with the open source tools provided in this book and save your firm thousands of dollars? Even if you don’t want to throw out your commercial firewall, learning more about firewall basics and what happens behind the scenes will help you keep your firewall more securely configured.

Before we dive into the tools, I want to go over the basics of what a firewall does and how it works with the various network protocols to limit access to your network. Even if you are not planning to use open source software for your firewall, you can still benefit from knowing a little more about what is really going on inside that black box.

Network Architecture Basics

Before you can truly understand network security, you have to first understand network architecture. Although this book is not intended to serve as a network primer, this section is a quick review of network concepts and terms. I will be referring to these terms often and it will help you to have a basic understanding of the TCP/IP protocol. If you are already well-schooled in network topologies, then you can skip over this section and jump straight into the tools.

As you may know, every network design can be divided into seven logical parts, each of which handles a different part of the communication task. This seven-layered design is called the OSI Reference Model. It was created by the International Organization for Standardization (ISO) to provide a logical model for describing network communications, and it helps vendors standardize equipment and software. Figure 3.1 shows the OSI Reference Model and gives examples of each layer.

Physical

This layer is the actual physical media that carries the data. Different types of media use different standards. For example, coaxial cable, unshielded twisted pair (UTP), and fiber optic cable each serve a different purpose: coaxial cable is used in older LAN installations as well as Internet service through cable TV networks, UTP is generally used for in-house cable runs, while fiber optic is generally used for long-haul connections that require a high load capacity.

Data Link

This layer relates to different pieces of network interface hardware on the network. It helps encode the data and put it on the physical media. It also allows devices to identify each other when trying to communicate with another node. An example of a data link layer address is your network card’s MAC address. (No, the MAC address doesn’t have anything to do with Apple computers; it’s the Medium Access Control number that uniquely identifies your computer’s card on the network.) On an Ethernet network, MAC addresses are the way your computer can be found. Corporations used many different types of data link standards in the 1970s and 80s, mostly determined by their hardware vendor. IBM used Token Ring for their PC networks and SNA for most of their bigger hardware, DEC used a different standard, and Apple used yet another. Most companies use Ethernet today because it is widespread and cheap.

Network

This layer is the first part that you really see when interacting with TCP/IP networks. The network layer allows for communications across different physical networks by using a secondary identification layer. On TCP/IP networks, this is an IP address. The IP address on your computer helps get your data routed from place to place on the network and over the Internet. This address is a unique number to identify your computer on an IP-based network. In some cases, this number is unique to a computer; no other machine on the Internet can have that address. This is the case with normal publicly routable IP addresses. On internal LANs, machines often use private IP address blocks. These have been reserved for internal use only and will not route across the Internet. These numbers may not be unique from network to network but still must be unique within each LAN. While two computers may have the same private IP address on different internal networks, they will never have the same MAC address, as it is a serial number assigned by the NIC manufacturer. There are some exceptions to this (see the sidebar Follow the MAC), but generally the MAC address will uniquely identify that computer (or at least the network interface card inside that computer).

Flamey the Tech Tip: Follow the MAC

MAC addresses can help you troubleshoot a number of network problems. Although the MAC address doesn’t identify a machine directly by name, all MAC addresses are assigned by the manufacturer and start with a specific number for each vendor. Check out www.macaddresses.com for a comprehensive list. They are also usually printed on the card itself.

By using one of the network sniffers discussed in Chapter 6, you can often track down the source of troublesome network traffic using MAC addresses. MAC addresses are usually logged by things like a Windows DHCP server or firewalls, so you can correlate MAC addresses to a specific IP address or machine name. You can also use them for forensic evidence—amateur hackers often forge IP addresses, but most don’t know how to forge their MAC address, and this can uniquely identify their PCs.

Transport

This level handles getting the data packet from point A to point B. This is the layer where the TCP and UDP protocols reside. TCP (Transmission Control Protocol) basically ensures that packets are consistently sent and received on the other end. It allows for bit-level error correction, retransmission of lost segments, and the reordering and reassembly of fragmented traffic. UDP (User Datagram Protocol) is a lighter-weight scheme used for multimedia traffic and short, low-overhead transmissions like DNS requests. It also does error detection and data multiplexing, but does not provide any facility for data reordering or ensured data arrival. This layer and the network layer are where most firewalls operate.

Session

The session layer is primarily involved with setting up a connection and then closing it down. It also sometimes does authentication to determine which parties are allowed to participate in a session. It is mostly used for specific applications higher up the model.

Presentation

This layer handles certain encoding or decoding required to present the data in a format readable by the receiving party. Some forms of encryption could be considered presentation. The distinction between the presentation and application layers is a fine one, and some people argue that they are basically the same thing.

Application

This final level is where an application program gets the data. This can be FTP, HTTP, SMTP, or many others. At this level, some program handling the actual data inside the packet takes over. This level gives security professionals fits, because most security exploits happen here.

TCP/IP Networking

The TCP/IP network protocol was once an obscure protocol used mostly by government and educational institutions. In fact, it was invented by the military research agency, DARPA, to provide interruption-free networking. Their goal was to create a network that could withstand multiple link failures in the event of something catastrophic like a nuclear strike. Traditional data communications had always relied on a single direct connection, and if that connection was degraded or tampered with, the communications would cease. TCP/IP offered a way to “packetize” the data and let it find its own way across the network. This created the first fault-tolerant network.

However, most corporations still used the network protocols provided by their hardware manufacturers. IBM shops were usually NetBIOS or SNA; Novell LANs used a protocol called IPX/SPX; and Windows LANs used yet another standard, called NetBEUI, which was derived from the IBM NetBIOS. Although TCP/IP became common in the 1980s, it wasn’t until the rise of the Internet in the early 90s that TCP/IP began to become the standard for data communications. This brought about a fall in the prices for IP networking hardware, and made it much easier to interconnect networks as well.

TCP/IP allows communicating nodes to establish a connection and then verify when the data communications start and stop. On a TCP/IP network, data to be transmitted is chopped up into sections, called packets, and encapsulated in a series of “envelopes,” each one containing specific information for the next network layer. Each packet is stamped with a 32-bit sequence number so that even if they arrive in the wrong order, the transmission can be reassembled. As the packet crosses different parts of the network each layer is opened and interpreted, and then the remaining data is passed along according to those instructions. When the packet of data arrives at its destination, the actual data, or payload, is delivered to the application.

It sounds confusing, but here is an analogy. Think of a letter you mail to a corporation in an overnight envelope. The overnight company uses the outside envelope to route the package to the right building. When it is received, it will be opened up and the outside envelope thrown away. It might be destined for another internal mailbox, so they might put it in an interoffice mail envelope and send it on. Finally it arrives at its intended recipient, who takes all the wrappers off and uses the data inside.

As you can see, the outside of our data “envelope” has the Ethernet address. This identifies the packet on the Ethernet network. Inside that layer is the network information, namely the IP address; and inside that is the transport layer, which sets up a connection and closes it down. Then there is the application layer, which is an HTTP header, telling the Web browser how to format a page. Finally comes the actual payload of packet—the content of a Web page. This illustrates the multi-layered nature of network communications.

There are several phases during a communication between two network nodes using TCP/IP (see Figure 3.2). Without going into detail about Domain Name Servers (DNS) and assuming we are using IP addresses and not host names, the first thing that happens is that the machine generates an ARP (Address Resolution Protocol) request to find the Ethernet address corresponding to the IP address it is trying to communicate with. ARP converts an IP address into a MAC address on an Ethernet network.

Now that we can communicate to the machine using IP, there is a three-way communication between the machines using the TCP protocol to establish a session. A machine wishing to send data to another machine sends a SYN packet to synchronize, or initiate, the transmission. The SYN packet is basically saying, “Are you ready to send data?” If the other machine is ready to accept a connection from the first one, it sends a SYN/ACK, which means, “Acknowledged, I got your SYN packet and I’m ready.” Finally, the originating machine sends an ACK packet back, saying in effect, “Great, I’ll start sending data.” This communication is called the TCP three-way handshake. If any one of the three doesn’t occur, then the connection is never made. While a machine is sending its data, it tags the data packets with a sequence number and acknowledges any previous sequence numbers used by the host on the other end. When all the data has been sent, one side sends a FIN packet. The other side acknowledges it and sends a FIN of its own, which the first side answers with a final ACK to close out that TCP/IP session.

Because of the way TCP/IP controls the initiation and ending of a session, TCP/IP communications can be said to have state, which means that you can tell what part of the dialogue is happening by looking at the packets. This is very important for firewalls, because the most common way for a firewall to block outside traffic is to disallow SYN packets from the outside to machines inside the network. This way, internal machines can communicate outside the network and initiate connections to the outside, but outside machines can never initiate a session. There are lots of other subtleties in how firewalls operate, but basically that’s how simple firewalls allow for one-way-only connections for Web browsing and the like.

There are several built-in firewall applications in Linux: these are known as Iptables in kernel versions 2.4x, Ipchains in kernel versions 2.2x, and Ipfwadm in kernel version 2.0. Most Linux-based firewalls do their magic by manipulating one of these kernel-level utilities.

All three applications operate on a similar concept. Firewalls generally have two or more interfaces, and under Linux this is accomplished by having two or more network cards in the box. One interface typically connects to the internal LAN; this interface is called the trusted or private interface. Another interface is for the public (WAN) side of your firewall. On most smaller networks, the WAN interface is connected to the Internet. There also might be a third interface, called a DMZ (taken from the military term for Demilitarized Zone), which is usually for servers that need to be more exposed to the Internet so that outside users can connect to them. Each packet that tries to pass through the machine is passed through a series of filters. If it matches the filter, then some action is taken on it. This action might be to throw it out, pass it along, or masquerade (“Masq”) it with an internal private IP address. The best practice for firewall configuration is always to deny all and then selectively allow traffic that you need (see the sidebar on firewall configuration philosophy).



Firewalls can filter packets at several different levels. They can look at IP addresses and block traffic coming from certain IP addresses or networks, check the TCP header and determine its state, and at higher levels they can look at the application or TCP/UDP port number. Firewalls can be configured to drop whole categories of traffic, such as ICMP. ICMP-type packets like ping are usually rejected by firewalls because these packets are often used in network discovery and denial of service. There is no reason that someone outside your company should be pinging your network. Firewalls will sometimes allow echo replies (ping responses), though, so you can ping from inside the LAN to the outside.

Security Business Processes

At some point, preferably before you start loading software, you should document in writing a business process for your firewall(s). Not only will this be a useful tool for planning your installation and configuration, but it may also help if you have to justify hardware purchases or personnel time to your boss. Documenting your security activities will make you look more professional and emphasize the value you add to the organization, which is never a bad thing. It also makes it easier for anyone who comes after you to pick up the ball.

This plan documents the underlying processes and procedures to make sure that you get a business benefit from the technology. Installing a firewall is all well and good, but without the proper processes in place, it might not actually give the organization the security it promises. The following steps outline a business process for firewall implementation and operation.

1. Develop a network use policy. There may already be some guidelines in your employee manual on proper computer use. However, many computer use policies are intentionally vague and don’t specify which applications count as misuse. You may have to clarify this with your manager or upper management. Are things like instant messengers allowed? Do you want to follow a stringent Web and e-mail only outbound policy? Remember that it is safer to write a rule for any exceptions rather than allowing all types of activity by default. Getting the answers to these questions (hopefully in writing) is crucial before you start writing rules.


2. Map out services needed outward and inward. If you don’t already have a network map, create one now. What servers need to be contacted from the outside and on which ports? Are there users who need special ports opened up for them? (Hint: technical support staff often need FTP, Telnet, and SSH.) Do you want to set up a DMZ for public servers or forward ports to the LAN from the outside? If you have multiple network segments or lots of public servers, this could take longer than the firewall setup itself. Now is the time to find out about these special requests, not when you turn on the firewall and it takes down an important application.

3. Convert the network use policy and needed services into firewall rules. This is when you finally get to write the firewall rules. Refer to your list of allowed services out, required services in, and any exceptions, and create your firewall configuration. Be sure to use the “deny all” technique described in the sidebar to drop anything that doesn’t fit one of your rules.

4. Implement and test for functionality and security. Now you can turn on your firewall and sit back and wait for the complaints. Even if your rules conform exactly to policy, there will still be people who didn’t realize that using Kazaa to download movies was against company policy. Be ready to stand your ground when users ask for exceptions that aren’t justified. Every hole you open up on your firewall is a potential security risk.

Also, once your firewall is operating to your users’ satisfaction, make sure that it is blocking what it is supposed to be blocking. By using two tools discussed later in this book together, you can run tests against your firewall: A port scanner on the outside and a network sniffer on the inside will tell you which packets are getting through and which ones aren’t. This setup can also be useful for troubleshooting applications that are having problems with the firewall.

5. Review and test your firewall rules on a periodic basis.
Just because your firewall is working great today doesn’t mean it will be tomorrow. New threats may evolve that require new rules to be written. Rules that were supposed to be temporary, just for a project, may end up being left in your configuration. You should review your rules periodically and compare them with the current business requirements and security needs. Depending on the size and complexity of your configuration and how often it changes, this review may happen as infrequently as once a year for firewalls with a small rule set (20 or fewer rules), or as often as once a month for very complex firewalls. Each review should include an actual test using the scanner/sniffer setup mentioned above, using the tools in Chapters 4, 5, and 6, to verify that the rules are indeed doing what they are supposed to do.

Designing and using a business process such as this will help ensure you get a lot more out of your firewall implementation, both professionally and technically. You should also develop plans for the other technologies discussed in this book, such as vulnerability scanning and network sniffing.

A Quick Look at Cross Site Scripting

Here is one you might not have heard of: cross site scripting. With just a bit of JavaScript, a malicious attacker can use it to cause all sorts of problems. To find out more about what it is, and how to prevent your website from becoming a victim, keep reading.


Introduction

The question keeps nagging at us, bouncing around in our heads: is our website really secure? That’s a very tough question to answer. But one thing is true in all cases: no website is “completely” safe from attacks. Given the uncontrolled and anonymous nature of the Internet, the concept of a bulletproof website is merely a pipe dream.

More specifically, Web servers are inherently public machines: they are accessible to many people around the world and clearly exposed to several well-known attack techniques. The value of the information stored on servers varies widely, depending on what kind of sites they are hosting, but it’s always appealing to potential attackers. However, there is a lot that we can do about securing our website.

We are well aware of many attack methods that might end up exposing, modifying, or deleting sensitive data, so we guard our site against them. Also, we have updated our software accordingly, stopped unnecessary services on the server, closed unused TCP ports, encrypted data, and the like. What else could be vulnerable? One thing that is often ignored or not properly considered: the assumptions made by developers.

Designers and programmers need to make many assumptions. Hopefully, they will document their assumptions and usually be right. Sometimes, though, developers will make poor assumptions. These might include assuming that input data will be valid, will not include unusual characters, or will be of a fixed length. That brings us almost immediately to the well-known “SQL injections,” widely documented in several articles on the Web, along with Cross Site Scripting attacks. Here is where this article comes in.

In the rest of the article, I'll cover what Cross Site Scripting is, how it works and how it can be avoided, increasing our site’s security level and, hopefully, bringing an overall improvement to our security strategy.

What is Cross Site Scripting?


To understand what Cross Site Scripting is, let’s look at a situation common to many sites. Let’s say we are taking some information passed in on a querystring (the string after the question mark (?) character in a URL), with the purpose of displaying the content of a variable, for example, the visitor’s name:

http://www.yourdomain.com/welcomedir/welcomepage.php?name=John

As we can see in this simple querystring, we are passing the visitor’s name as a parameter in the URL, and then displaying it on our “welcomepage.php” page with the following PHP code:


<?php
echo 'Welcome to our site ' . stripslashes($_GET['name']);
?>

The result of this snippet is shown below:

Welcome to our site John

This is pretty simple and straightforward. We’re displaying the content of the “name” variable, by using the $_GET superglobal PHP array, as we have done probably hundreds of times. Everything seems to be fine. Now, what’s wrong with this code? Nothing really. But let’s modify the querystring by replacing our visitor’s name passed in the URL:

http://www.yourdomain.com/welcomedir/
welcomepage.php?name=John

with something like this:

http://www.yourdomain.com/welcomedir/
welcomepage.php?name=<script language=javascript>
alert('Hey, you are going to be hijacked!');</script>

Do you remember the PHP code included in our “welcomepage.php” page? Yes, you’re correct. When we modify the querystring, the following code is effectively executed:


<?php
echo 'Welcome to our site ' .
  '<script language=javascript>alert("Hey, you are going to be hijacked!");</script>';
?>

The output of this code is a JavaScript alert box telling you “Hey, you are going to be hijacked!” after the “Welcome to our site” phrase.

Very ugly stuff, right? That’s a simple example of the Cross Site Scripting vulnerability. It means that any JavaScript code pasted into the URL will be executed happily, with no complaints at all.

Going deeper into JavaScript

Following the same concept described above, we might build a new URL to achieve more dangerous and annoying effects. It’s just a matter of including a little bit of JavaScript.

For instance:

http://www.yourdomain.com/welcomedir/welcomepage.php?
name=<script language=javascript>window.location=
"http://www.evilsite.com";</script>

It’s getting more complex now. As we can see, a JavaScript redirection to “www.evilsite.com” will take place just by entering the above URL in the browser’s location bar. At first glance, this doesn’t seem so bad. After all, we haven’t seen anything that could significantly harm our website. But is that really true? Let’s present a new example, which might quickly change your mind.

We’ll demonstrate how easy it is to manipulate URLs and inject JavaScript into them for malicious purposes.

For example:

http://www.yourdomain.com/welcomedir/welcomepage.php?
name=<script language=javascript>setInterval
("window.open('http://www.yourdomain.com/','innerName')",100);
</script>

Now, let’s explain in detail what’s going on here. We have inserted JavaScript code that makes a request for the http://www.yourdomain.com index page every 100 milliseconds. The setInterval() method takes care of the task, but other JavaScript methods, such as setTimeout() with a recursive implementation, would do the trick too. The code could either heavily overload the Web server where our site is located or generate a denial of service condition by denying access to other visitors requesting the same page (or other pages), inflicting noticeable damage on server performance. It would also be harmful to our website’s reputation, simply because other users cannot get access to it. Not very good, huh?

Please note that a similar attack effect might be achieved by manipulating sockets with PHP or any other programming language, but that’s another huge subject, outside the scope of this article. Anyway, keeping your sharp eyes open for unusual levels of traffic is a must. So don’t ever forget to take a look at your site’s log files and use software for monitoring traffic and real-time statistics.

Unfortunately, there are a huge number of ways to attack websites using Cross Site Scripting by embedding JavaScript code in the URL. They range from relatively innocent and harmless scripts to risky and harmful code, and we have to try to prevent or avoid them all.

As if this weren’t enough, we’ll see another common Cross Site Scripting technique: hiding JavaScript code within links.

The hidden link

Adding JavaScript code to querystrings is quite easy to do, so the same concept can be applied to regular links. This is easy to deduce, since all of the previous examples manipulated absolute links directly from the location bar. Thus, relative and absolute links within documents or email messages can be tampered with too.

An example is useful to properly understand how this technique works:

<a href="http://www.yourdomain.com/welcomedir/
welcomepage.php?name=<script language=javascript>window.location=
'http://www.evilsite.com';</script>">healthy food</a>

If we take a closer look at the code listed above, we can see clearly what’s going on. Within the regular link, JavaScript code is inserted to redirect users to a completely different site. The expression seems to be an innocent link, but it’s in fact hiding something else: the JavaScript embedded in the link.

We might send out this link to someone else, and our unsuspecting recipient would click the link to find out a little more about healthy food, and instead be redirected to a different site, getting something he or she would never expect to see.

Our site’s reputation could be seriously wounded, as we can easily imagine, if someone took care of sending our URL, with the JavaScript code embedded in the link, to numerous recipients. That would result in the nasty redirecting effect previously described. And the recipients wouldn’t be happy about it at all!

Having presented the most commonly used Cross Site Scripting techniques, we need to tackle a proper solution to avoid their ugly effects and prevent ourselves from becoming victims of them.

Let’s see how the problem can be solved.

Preventing Cross Site Scripting

First off, we need to follow simple and straightforward rules, applicable to common scenarios where user input is always involved.

Always, all the time, and constantly (pick your term), check what’s coming in from POST and GET requests. However obvious this may seem, you should never skip this step.

If a specific type of data is expected, check to ensure that it’s really a valid type and that it’s of the expected length. Whatever programming language you’re using will give you the possibility and the power to do that easily, as the short sketch below shows.
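Here is a tiny sketch of that idea in PHP (my own illustration, using a hypothetical numeric “age” parameter rather than anything from the pages above): before using a value that is supposed to be a small number, we confirm that it really is numeric and that its length is reasonable:

<?php
// expecting a purely numeric 'age' parameter of at most 3 digits
$age = isset($_GET['age']) ? $_GET['age'] : '';

if ( !ctype_digit($age) || strlen($age) > 3 ) {
    // wrong type or wrong length: reject the input
    die('Invalid input.');
}

echo 'Your age is ' . $age;
?>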

Whenever possible, use client-side validation for adding extra functionality to user input checking. Please note that JavaScript validation cannot be used on its own for checking data validity, but it may help to discourage some evil-minded visitors from entering malicious data while providing useful assistance to other well-intended users.

Remove conflicting characters from user input. Search for < and > characters and make sure they're quickly removed. Single and double quotes must be escaped properly too. Many professional websites fail when dealing with character escaping. I hope you won’t.
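In PHP, a minimal way to do this (shown here only as a sketch of the idea, reusing our earlier “welcomepage.php” example) is to run the incoming value through htmlspecialchars(), which converts <, >, quotes and & into harmless HTML entities before they reach the browser. strip_tags() is an option if you prefer to remove markup entirely:

<?php
// escape dangerous characters before echoing user input back to the page
$name = isset($_GET['name']) ? $_GET['name'] : '';
echo 'Welcome to our site ' . htmlspecialchars($name, ENT_QUOTES);
?>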

We might go on endlessly, with numerous tips about validating user data, but you can get a lot more from just checking some other useful tutorials and articles. For the sake of this article, we’ll show an example to prevent Cross Site Scripting using PHP.

Coding for our safety

Let’s define a simple function to prevent the querystring from being tampered with by external code. The “validateQueryString()” function is the following:


<?php
function validateQueryString( $queryString, $min = 1, $max = 32 ) {
    if ( !preg_match( "/^([a-zA-Z0-9]{".$min.",".$max."}=[a-zA-Z0-9]{".$min.",".$max."}&?)+$/", $queryString ) ) {
        return false;
    }
    return true;
}
?>

Once we have defined this function, we call it this way:


<?php
$queryString = $_SERVER['QUERY_STRING'];
if ( !validateQueryString( $queryString ) ) {
    header( 'Location: errorpage.php' );
}
else {
    echo 'Welcome to our site!';
}
?>

Let’s break down the code to see it in detail.

The function performs pattern matching on the querystring passed as a parameter, checking to see whether it matches the standard format of a querystring, where GET variable names and values contain only the numbers 0-9 and letters in lowercase or uppercase. Any other characters will be considered invalid. Also, we have specified as a default that names and values can be from 1 to 32 characters long. If a match is not found, the function returns false. Otherwise, it returns true.

Next, we have performed validation on the querystring by calling the function. If it returns false -- that is, the querystring contains invalid characters -- the user will be taken to an error page, or whatever you like to do. If the function returns true, we just display a welcome message.
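If you want to see the function at work on its own, outside of the page flow, a quick test sketch like this (my own, with made-up sample strings) shows what it accepts and rejects:

<?php
var_dump( validateQueryString('name=John') );
// bool(true)

var_dump( validateQueryString('name=<script>alert(1)</script>') );
// bool(false)
?>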

Of course, most of the time, we really know what variables to expect, so our validation function can be significantly simplified.

Given the previous URL,

http://www.yourdomain.com/welcomedir/
welcomepage.php?name=John

where the “name” variable is expected, we might write the new “validateAlphanum()” function:


<?php
function validateAlphanum( $value, $min = 1, $max = 32 ) {
    if ( !preg_match( "/^[a-zA-Z0-9]{".$min.",".$max."}$/", $value ) ) {
        return false;
    }
    return true;
}
?>

and finally validate the value like this:


<?php
$name = $_GET['name'];
if ( !validateAlphanum( $name ) ) {
    header( 'Location: errorpage.php' );
}
else {
    echo 'Welcome to our site!';
}
?>

The concept is the same as explained above. The only noticeable difference is that we’re taking in the “name” variable as the parameter for the “validateAlphanum()” function and checking if it contains only the allowed characters 0-9, a-z and A-Z. Anything else will be considered an invalid input.

If you’re a strong advocate of object-oriented programming, as I am, we might easily include this function as a new method of an object that performs user data validation. Something similar to this:


<?php
// get variable value
$name = $_GET['name'];
// instantiate new data validator object
$dv = &new dataValidator();
// execute validation method
if ( !$dv->validateAlphanum( $name ) ) {
    header( 'Location: errorpage.php' );
}
else {
    echo 'Welcome to our site!';
}
?>


Pretty simple, isn’t it?

In order to avoid Cross Site Scripting, several approaches can be taken, whether procedural or object-oriented programming is your personal taste.

In both cases, we’ve developed specific functions to validate querystrings and avoid tampered or unexpected user input data, demonstrating that Cross Site Scripting can be prevented easily with some help coming from our favorite server-side language.

Conclusion

As usual, dealing with user input data is a very sensitive issue, and Cross Site Scripting falls under this category. It is a serious problem that can be avoided with some simple validation techniques, as we have seen throughout this article.

Building robust applications that don’t make poor assumptions about visitors’ input is definitely the correct way to prevent Cross Site Scripting attacks and other harmful techniques. Client environments must always be considered pretty unsafe and unknown territory. So, for the sake of your website’s sanity and yours, keep your eyes open.

What's new in Mobile and Enterprise Access

About This Webcast:

Abstract: Accelerate your business with a universal managed client for SOA. Help your company be more responsive and your employees more productive by giving your employees secure access to critical business applications anytime, anywhere -- online or offline. Join this webcast to learn about recent enhancements to IBM mobile and enterprise access products.

The SOA-based Lotus Expeditor now has support for composite applications (aggregation of applications and components at the glass), integration, disconnected access (for offline portal, offline forms, and other applications), server management of clients, mobile access, and both rich client and browser-based access. And for your mobile workers, Lotus Mobile Connect provides a secure mobile VPN and seamless roaming.

Additional Methods for Using SQLite with PHP 5

Welcome to the concluding part of the series “Using SQLite with PHP 5.” As you probably know, PHP 5 comes equipped with a fully-featured RDBMS called SQLite that can definitely make your life as a PHP developer much easier. It's particularly helpful if you want to get rid of MySQL, at least for a time, while keeping the data layer of your application completely isolated from the business logic.

Over the course of the second tutorial, I explained several methods that come packaged with the SQLite library and perform different tasks, such as counting the number of rows and fields contained within a specific result set, fetching one row at a time, working with unbuffered queries, and so forth.

If you’ve read the two previous articles that belong to this series (as I suppose you have), then I’m sure that you realize the great capabilities offered by SQLite. It's especially useful in those cases where you need to work with a decent RDBMS but don't need to appeal directly to the features offered by the popular MySQL server.

Well, at this point you may be thinking that you’ve learned everything about the cool methods included with SQLite, since the material that I provided you during the previous articles has certainly been abundant.

However, if you think that way, I’m afraid that you’re wrong. SQLite has some other methods that can be useful for performing all sorts of clever tasks, including the definition of custom functions, finding the IDs of inserted rows, the creation of memory-based databases, and so on.

As you can see, the list of additional features offered by SQLite is really impressive. Therefore, in this last article of the series, I’ll be taking an in-depth look at them. This will complete our analysis of this excellent RDBMS integrated with PHP 5.

Are you ready to go over the last miles of this learning journey? Fine, let’s get started!

Using the seek() and lastInsertRowid() methods

Moving back and forth across a specified result set is a task that can be performed with minimal difficulty when using SQLite, since the library has been equipped with the intuitive “seek()” method, which does exactly this.

With reference to this method in particular, below I developed a simple example that shows how it works. Look at the corresponding code listing, please:

// example using the 'seek()' method

// create new database using the OOP approach

$db=new SQLiteDatabase("db.sqlite");

// create table 'USERS' and insert sample data

$db->query("BEGIN;

CREATE TABLE users (id INTEGER(4) UNSIGNED PRIMARY KEY,
name CHAR(255), email CHAR(255));

INSERT INTO users (id,name,email) VALUES
(NULL,'User1','user1@domain.com');

INSERT INTO users (id,name,email) VALUES
(NULL,'User2','user2@domain.com');

INSERT INTO users (id,name,email) VALUES
(NULL,'User3','user3@domain.com');

COMMIT;");

// fetch rows from the 'USERS' database table

$result=$db->query("SELECT * FROM users");

// loop over rows of database table

while($row=$result->fetch(SQLITE_ASSOC)){

// display row

echo $row['id'].' '.$row['name'].' '.$row['email'].'<br />';

}

// move pointer to second row

$result->seek(1);

while($row=$result->fetch(SQLITE_ASSOC)){

// display row

echo $row['id'].' '.$row['name'].' '.$row['email'].'<br />';

}

/*

the second loop displays the following

2 User2 user2@domain.com

3 User3 user3@domain.com

*/

As you can see, the snippet listed above shows a simple yet effective implementation of the referenced “seek()” method. First, the script obtains a result set via the respective “query()” method and displays all of its rows; then it moves the pointer back to the second row. Finally, after doing this, the remaining records are displayed on the browser again. Quite intuitive, right?
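By the way, nothing stops you from rewinding the same buffered result set completely; continuing the listing above, a one-line sketch of my own would be:

// move the pointer back to the very first row of the buffered result set
$result->seek(0);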

Okay, now that you hopefully understand how the previous methods do their thing, take a look at the following one, which determines the ID of the last inserted row. One possible usage of this method is demonstrated by the example below:

// example using the 'lastInsertRowid()' method

// create new database using the OOP approach

$db=new SQLiteDatabase("db.sqlite");

// insert new row into 'USERS' database table

$db->query("INSERT INTO users (id,name,email) VALUES
(NULL,'User4','user1@domain.com')");

echo 'ID of last inserted row is '.$db->lastInsertRowid();

/*

// displays the following

ID of last inserted row is 4

*/

As shown above, the “lastInsertRowid()” method is extremely useful for doing what it clearly suggests: finding the ID of the last inserted row. Indeed, if you’re anything like me and work intensively with DML statements, you’ll find the previous method really handy.

So far, the couple of methods that I covered are pretty straightforward, since they're very similar to some of the MySQL-related PHP functions that you’ve used probably hundreds of times.

However, there’s still more valuable material to review here concerning SQLite's capabilities. Therefore, in the next few lines I’ll explain two more methods. The first one can be used for running queries instead of using the previous “query()” method. The second one is handy for counting the number of rows affected after performing a DML operation.

Using the changes() and queryExec() methods

As I mentioned at the end of the previous section, I’m going to show you a couple of additional methods bundled with the SQLite library that can be valuable. They work well for those situations where you want to use an alternative way to run queries, and for determining the number of rows affected after performing a DML statement.

The first method that I’ll teach you is “queryExec().” It consists of a simple replacement of the “query()” method that you learned before. Here’s how to use it:

// example using the 'queryExec()' method

// create new database using the OOP approach

$db=new SQLiteDatabase("db.sqlite");

$query="INSERT INTO users (id,name,email) VALUES(NULL,'John
Doe','john@domain.com')";

if(!$db->queryExec($query)){

trigger_error('Error performing query '.$query, E_USER_ERROR);

}

If you take some time and examine the above short example, you’ll understand why I said the “queryExec()” method can be used as an alternative to the previously reviewed “query().” In this case, the example speaks for itself, therefore I suggest you pay attention to the following code sample. It is much more useful, since it illustrates a basic application of the brand new “changes()” method.

The script listed below shows precisely how you can use this method to calculate the number of rows affected after running a DML statement:

// example using the 'changes()' method

// create new database using the OOP approach

$db=new SQLiteDatabase("db.sqlite");

// insert new row into 'USERS' database table

$db->query("INSERT INTO users (id,name,email) VALUES
(NULL,'User4','user1@domain.com')");

echo 'Number of rows modified after the insertion '.$db->changes();

/*

// displays the following

Number of rows modified after the insertion 1

*/

As you can see in the above example, the “changes()” method can be really helpful if you want to know how many rows were affected after inserting, updating or deleting records of a particular database. Of course, this method is very similar to the PHP “mysql_affected_rows()” function, so you shouldn’t have too many problems understanding how it works.
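To make the contrast with the insertion example clearer, here is a quick sketch of my own (it simply reuses the sample “users” table from the previous listings) that runs an UPDATE statement and then asks “changes()” how many rows were actually touched:

<?php
// example using the 'changes()' method after an UPDATE

// create new database using the OOP approach
$db=new SQLiteDatabase("db.sqlite");

// update all rows of the 'USERS' table that match the condition
$db->query("UPDATE users SET email='updated@domain.com' WHERE name LIKE 'User%'");

// display the number of rows modified by the UPDATE
echo 'Number of rows modified after the update '.$db->changes();
?>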

All right, at this stage I believe that you’ve been provided with a neat set of SQLite methods which can be used for tackling different tasks. However, we’ve not come to the end of the tutorial yet, since there are a few more methods that remain uncovered.

Speaking of that, in the following section, I’ll teach you how to use iterators to traverse different result sets, and how to define custom functions with SQLite as well.

Using the createFunction() method


As I mentioned in the previous section, the SQLite library has been provided with the ability to work with iterators, so a specific result set can be traversed using only a typical “foreach” language construct. Certainly, you’ll have to agree with me that this feature is really handy, since there’s no need to write custom code for iterating over data sets.

To learn more about the use of iterators with SQLite, please have a look at the following example, which shows a simple implementation of this neat concept:

// example of 'SQLite' iterators

// create new database using the OOP approach

$db=new SQLiteDatabase("db.sqlite");

// fetch rows from the 'USERS' database table

$result=$db->unbufferedQuery("SELECT * FROM users");

// use 'foreach' loop to traverse result set

foreach($result as $row){

echo 'ID: '.$row['id'].' Name :'.$row['name'].' Email :'.$row['email'].'<br />';

}

/* displays the following:

ID: 1 Name :User1 Email :user1@domain.com

ID: 2 Name :User2 Email :user2@domain.com

ID: 3 Name :User3 Email :user3@domain.com

*/

As you can see, the above code snippet demonstrates how a specified result set can be traversed by using a simple iterator. In this case, the script first obtains the mentioned data set via the “unbufferedQuery()” method you learned before and finally uses a common “foreach” loop to traverse the data structure in question. Simple and efficient, isn’t it?

Now that you have hopefully grasped the concept behind using iterators with SQLite, it’s time to look at another useful method which I’m certain you’ll find very handy. In this case I’m talking about the “createFunction()” method. As the name clearly suggests, it's really helpful for creating user-defined functions that can be used inside queries against a particular result set, for example as part of a WHERE clause.

With reference to this excellent capability, below I coded a basic example of how to use a custom function with SQLite. Take a look at the corresponding code sample:

// example of custom functions

// create custom function
function getRandomID($id){
    return rand($id,5);
}

// create new database using the OOP approach
$db=new SQLiteDatabase("db.sqlite");

$db->createFunction('getRandomID','getRandomID',1);

// fetch a row from the 'USERS' database table
$result=$db->query("SELECT * FROM users WHERE id=getRandomID(2)");

foreach($result as $row){
    echo 'ID: '.$row['id'].' Name :'.$row['name'].' Email :'.$row['email']."\n";
}

/*
displays the following (the exact row will vary, since the ID is picked at random):

ID: 2 Name :User2 Email :user2@domain.com
*/

As you’ll realize, the above example begins by creating the custom “getRandomID()” function, which returns a random integer between the supplied value and 5. After this function has been registered via “createFunction()”, it’s used in the WHERE clause of the SELECT statement to fetch a random row from the respective database table.

Of course, this is only a basic application of custom functions with SQLite; you can experiment by defining your own, certainly more useful ones.
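As a further, purely illustrative example of the kind of function you might register, the short sketch below defines a hypothetical “formatEmail()” helper that lowercases e-mail addresses before they show up in a result set. Only “createFunction()” and the standard PHP string function are real here; the helper’s name and the way it’s used are my own assumptions:

// illustrative example: register a PHP helper as an SQLite custom function

// hypothetical helper that lowercases an email address
function formatEmail($email){
    return strtolower($email);
}

// create new database using the OOP approach
$db=new SQLiteDatabase("db.sqlite");

// expose the PHP function to SQL under the name 'formatEmail'
$db->createFunction('formatEmail','formatEmail',1);

// call the custom function right in the SELECT list
$result=$db->query("SELECT name, formatEmail(email) AS email FROM users");

foreach($result as $row){
    echo 'Name :'.$row['name'].' Email :'.$row['email']."\n";
}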

So far, I’ve covered the most important methods that come bundled with the SQLite library. But I’m not finished yet, since I’d like to teach you another cool feature included with this lightweight RDBMS.

Do you remember that at the beginning of this series I mentioned SQLite’s capability for working with memory-based databases? I hope you do, because in the last section of this article, I’ll show you how to use this feature in your own database-driven PHP applications.

Creating databases in server memory

The last SQLite feature that I plan to cover here is the creation of databases in server memory, instead of on the conventional file system. As you can imagine, this type of database is useful (among other situations) when you need a fully-fledged relational database at your disposal, but your data is only temporary, living no longer than the execution of your application.

That being said, defining a memory-based database with SQLite comes down to writing something as simple as this:

// example using memory-based database

// create a new memory-based database
$db = new SQLiteDatabase(":memory:");

// create table 'USERS' and insert some data
$db->query("BEGIN;
    CREATE TABLE users (id INTEGER PRIMARY KEY, name VARCHAR(255), email VARCHAR(255));
    INSERT INTO users (id,name,email) VALUES (NULL,'User1','user1@domain.com');
    INSERT INTO users (id,name,email) VALUES (NULL,'User2','user2@domain.com');
COMMIT;");

// display number of affected rows after the insertion
echo $db->changes()." rows affected by the insertion\n";

// display ID of last inserted row
echo "ID of last inserted row is: ".$db->lastInsertRowid();

/*

displays the following

2 rows affected by the insertion

ID of last inserted row is: 2

*/

As shown above, a new database has been created in memory simply by passing the “:memory:” argument to the corresponding SQLite constructor. Once that was done, I defined a “USERS” table, inserted some trivial data, and finally displayed the number of affected rows along with the ID of the last inserted row.

As I always suggest, try creating different memory-based databases and watch what happens in each case. The process is truly educational.
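One experiment worth trying, for instance, is copying data out of a file-based database into a memory-based one so it can act as a fast, throwaway working copy. The rough sketch below assumes the “db.sqlite” file and its “users” table from the earlier examples already exist; everything else is the same API you’ve seen so far:

// sketch: copy rows from a file-based database into a memory-based one

// open the existing file-based database
$fileDb=new SQLiteDatabase("db.sqlite");

// create a temporary database in server memory and recreate the table
$memDb=new SQLiteDatabase(":memory:");
$memDb->queryExec("CREATE TABLE users (id INTEGER PRIMARY KEY, name VARCHAR(255), email VARCHAR(255))");

// fetch all the rows from disk and insert them into the memory database
$rows=$fileDb->arrayQuery("SELECT * FROM users",SQLITE_ASSOC);
foreach($rows as $row){
    $name=sqlite_escape_string($row['name']);
    $email=sqlite_escape_string($row['email']);
    $memDb->queryExec("INSERT INTO users (id,name,email) VALUES (NULL,'$name','$email')");
}

echo count($rows).' rows copied into the memory-based database';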

Final thoughts

We’ve come to the end of this series. In these three consecutive tutorials, I took an in-depth look at the most relevant methods that come with the SQLite RDBMS, which has been included with PHP 5.

As you learned here, if your database-driven application doesn’t require all the features offered by MySQL, or another RDBMS, then this lightweight yet powerful library is worth considering.

Game Programming using SDL: Getting Started

Game programmers using OpenGL have often been forced to choose between a library that is platform independent but can't use all the available resources, and one that is powerful but platform dependent. Simple DirectMedia Layer (SDL) offers a third way. This article will give you a taste of its capabilities.

Game programming has come a long way since the early Linux and Windows days. Gone is the time when games were limited to Windows or, at best, the Mac. Today portability is at the forefront, even in the gaming segment. The birth of OpenGL was the first step in this regard. But OpenGL addressed only the rendering aspect of game programming. The major part, communicating with varied input devices, was left to the operating system. That is the reason for the existence of various extensions and companions to OpenGL, including GLUT (platform independent), Mesa (for *nix systems) and WGL (for Windows).

Each of these has its own pros and cons. If a library is OS independent, then it is limited in its use of the available resources. If it is able to harness the power of the underlying system, then it is platform dependent. Apart from portability issues, all the existing libraries left the task of developing the gaming infrastructure on the shoulders of the developer. It was during such times of extreme choices that SDL came into the picture.

SDL (Simple DirectMedia Layer) is a library "by the game programmers for the game programmers." Hence it doesn't try to achieve the "unachievable" by starting from scratch. Instead it is built upon the existing libraries for each OS: it uses DirectX on Windows and the X Window System APIs on *nix systems. Additionally, SDL provides for all the infrastructure needs of a varied range of games.

In this discussion, I will focus on setting up SDL and accessing one of its many infrastructure facilities -- loading a sprite. In the first section I will enumerate the infrastructure services. The second section will focus on initializing the video subsystem to achieve the best resolution. In the third section, I will discuss how to load a bitmap using SDL APIs, and also detail a real-world implementation of using an SDL API for sprite loading.

SDL: The Services Provided

The implementation works in such a way that it never gets in the way of the programmer's code. In a nutshell, one can say that SDL follows the philosophy of SMILE (Simple Makes It a Lot Easier), as the following list of services shows:

1. Initialization and Shutdown
2. Input processing
3. Timers
4. Sound effects
5. Graphics manipulation
6. Network integration
7. Threading requirements

Of these services, the first five are the basis of any game. SDL makes dealing with each of them easier. Let's see how.

1. Initialization and Shutdown:

Whenever a game starts, it must perform initialization routines including memory allocation, resource acquisition, loading any required data from the disk, and so forth. To perform these routines, the programmer has to query the underlying OS to learn the boundaries it sets -- which means writing code for the query and more code to use its result. SDL abstracts all of this with a single function: SDL_Init().

2. Input Processing:

In a gaming environment the input can come from the keyboard, joystick, mouse and so on. The processing model provided by SDL is event based. Anyone who has worked in VB, Delphi or Xlib (or any of its variants) will feel at home with SDL's event model. The base of this model is the SDL_WaitEvent() function, which takes a pointer to an SDL_Event structure.

3. Timers:

Without timers it is nearly impossible to imagine any challenging game. If one goes by standard methods, one would have to rely on the timers provided by the platform. But with SDL, this is a thing of the past. The Time and Timer APIs it provides are lean, mean and clean, in a platform- and OS-independent way. SDL_GetTicks() is the core of the SDL Timer API.

4. Sound Effects:

As with the other functionality provided by SDL, sound support comes with minimum hassle. The sound support in the core sub-system is minimal in nature, adhering to the keep-it-lean philosophy of SDL, but other libraries (SDL_mixer, for example) provide extended capabilities on top of SDL's APIs.

5. Graphics Manipulation:

With SDL one has the option of working at the raw pixel level or at a higher level using OpenGL. Since OpenGL is available for every platform and it can render both 2D and 3D graphics in hardware accelerated mode, it is better to use OpenGL in conjunction with SDL.

6. Networking Requirements:

Like other functionality, networking is important in the current genre of games. Understanding this importance, the developers of SDL provided an API that does the ground-level work of setting up and managing network connections, thus making networked multiplayer games less of an enigma.

7. Threading Requirements:

The pthreads library provided by POSIX is a platform independent way of working with threads. But the API works at a low level, which can be confusing. To make threading simpler, SDL provides all the required functionalities in a high-level manner.

In essence, SDL provides for all gaming requirements in a simple and portable way. Now that the introduction to its functionality is out of the way, we can see how the theory works in practice by looking at how the video subsystem is handled.

Thursday, December 28, 2006

TFTP and Error Correction

It's a curious paradox that digital networks, bred of high binary certainty, can be so capricious. But they are, at best, barely domesticated animals.

All the same, we've built our civilization on their restive backs--a civilization dependent on perfect information. Take a thousand bits, change one of them: your password doesn't work, a train switches to the wrong track, your bank account has $65,536 extra in it, the space station's orbit-keeping software malfunctions. Nature abhors perfection almost as much as a vacuum; there's nothing it loves to do more than flip bits. But somehow, despite nature's puckish tendencies, your password works, the train keeps rolling, you're still broke, and the ISS is still up there. How?

Long ago, scientists and engineers learned that the best way to make sure something works is to assume that it will fail. A lot. In fact, the most wildly successful communication protocols, such as Ethernet and Internet Protocol, not only accept failure as inevitable, but seem almost to revel in it. If you read their specs you'll notice that most of their considerable cleverness is given over to how to survive when everything blows up. In communication circles this is called error correction.

TFTP: Acknowledging the Possibility of Data Misadventure

Which brings us to my favorite protocol: the Trivial File Transfer Protocol. TFTP's design is revealing the same way that a car's airbags are. As an airbag testifies to the probability and violent nature of a crash, so TFTP's design speaks to frequent and catastrophic data misadventure. Its trivialness of purpose (to move one file from one computer to another--nothing more, nothing less) makes it something like a 1950's truck engine: crack open the hood and it's still simple enough to understand in an afternoon. And once you understand it, most other protocols are just TFTP with dual overhead cams and a swish fuel injection system.

It's revealing then to find out that TFTP is almost entirely composed of error correction. And instructive to note that it uses only three basic techniques in the process: redundancy, acknowledgement, and timeouts. These are the pistons and carburetors of error correction, basics worth understanding because they crop up everywhere, from P2P to 802.11g.

We'll look at each one in turn, but first a glance at what TFTP has to work with.

UDP: Usually Drops Packets

To understand the nature of TFTP, we must first look at its dependency on another protocol, the User Datagram Protocol (UDP).

UDP's purpose in life is to carry a datagram packet from one computer to another over a network. A datagram is a chunk of data with an address attached to it; imagine it as a postcard. And like postcards, datagrams are typically not big enough to contain anything serious--so many must be sent, each with a small part of the whole (usually less than a kilobyte). In other words, to transfer a whole file, TFTP must first break it up into datagrams, which it then gives to UDP to deliver.
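As a rough sketch of that first step (just the chunking idea, not TFTP's actual packet format), splitting a file into fixed-size pieces is nearly a one-liner; TFTP happens to use 512-byte data blocks, so let's borrow that size. The file name here is only a placeholder:

// sketch: break a file into fixed-size chunks, one chunk per datagram
$data = file_get_contents("somefile.txt");   // placeholder file name
$blocks = str_split($data, 512);             // TFTP's standard data block size is 512 bytes

echo count($blocks)." datagrams needed to carry the file\n";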

The trouble with UDP is that it's only as reliable as the network that it runs on, which is to say not at all. Datagrams get lost or misrouted, they usually arrive out of order, they might even arrive twice. Think of Charlie Chaplin delivering postcards.

For example, I send a message with one letter per datagram: HELLO. The receiver gets LHHEO. Not only is this possible, it's likely.

So the relationship between TFTP and UDP is this: TFTP needs UDP to send its packets across the network, and UDP needs TFTP to unravel the mess that it creates.

Now let's look at the first tool that TFTP uses to accomplish that: redundancy.

Superfluity, Redundancy, and Repetitiousness

Imagine for a moment a fictitious colonial scene: the ambassador from Lemuria enters the court of King Ed of Florin and says hello. This takes ten minutes. A courtly greeting is a dissertation on genealogy, titles, and treaties between the two nations. During which, Ed is referred to as (at the very least) "the sovereign king", or "his royal highness".

Have a look at those two phrases, you'll notice that they repeat themselves. Sovereign means king; the two words can be used interchangeably. Likewise for "royal" and "highness".

Let's say that His Royal Highness King Ed (Sovereign of Florin) is kind of miffed at Lemuria, and wants to catch the Ambassador on any breach of etiquette so he can feign grave insult and send a couple of Florin warships out that way. The lives of many Lemurians rest on the Ambassador's flawless smarminess.

And let's suppose that someone coughs right in the middle of the Ambassador's address, obscuring the word "sovereign"... Lemuria is still safe, because the "king" part still got through.

Better yet than plain redundancy is what you could call synonymous redundancy. The purest form of repetition would be to say, "his royal royalness". But imagine that the Ambassador has a lisp, and can't say his R's. His Woyal Highness still can't take offence, because though he may not have been formally recognized as Royal, he's still a Highness.

Redundancy is probably the single most vital aspect of reliable communication. So how does TFTP use it?

The first way is actually a function of UDP. A number, called a checksum (a close relative of the Cyclic Redundancy Check, or CRC), is attached to every datagram packet. This is a numeric summary of the data contained in the packet.

As an analogy, let's take our greeting, and for every letter we also tag on its position in the alphabet (A is 1, B is 2, and so forth):

H8 E5 L12 L12 O15

The one thing that UDP does do reliably is to make sure that a datagram hasn't been altered during the transfer. If UDP sees a packet with a mismatch, like "H4", it would know that either the letter got changed or the number did. In either case it knows that something's up and throws the packet out.

The second way that TFTP uses redundancy is via explicit ordering.

It's something that we usually don't think about, but in any word there are actually two kinds of information. The first is the characters of the alphabet; the second is the order in which they appear. LHLOE doesn't mean the same thing as HELLO because the information implicit in the sequence of the characters has been altered.

We could explicitly note the order of the letters, but writing "1H2E3L4L5O" on paper isn't useful (besides being hard to read). You already know that E is the second letter because it's right after the H.

But you'll recall that UDP isn't as reliable as paper for keeping things straight. So, we add in ordering information. Our once simple hail (with redundant character and ordering information) is now:

1H8 2E5 3L12 4L12 5O15.

Which looks like a lot of superfluous writing. However, when a computer receives this:

3L12 1H8 4R5 1H8 2E5 5O15

it can make some sense of it. Let's sort it out:

Throwing out obviously bad packets (R is not the fifth letter of the alphabet, so its check number doesn't match), duplicates (too many H's), and rearranging things: we get HEL_O. Not quite what we were expecting, but it's a whole lot better than LHRHEO.
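A minimal sketch of that receiving logic, using the toy letter-and-position encoding from above (this is just an illustration of the idea, not real TFTP or UDP code), could look like this:

// sketch: reassemble a message from tagged packets like "3L12"
// each packet is: sequence number, letter, and the letter's position in the alphabet
$packets = array("3L12", "1H8", "4R5", "1H8", "2E5", "5O15");

$message = array();
foreach ($packets as $packet) {
    // split the packet into sequence number, letter and check number
    if (!preg_match('/^(\d+)([A-Z])(\d+)$/', $packet, $m)) {
        continue;
    }
    list(, $seq, $letter, $check) = $m;

    // redundancy check: the check number must match the letter (A=1, B=2, ...)
    if ((int)$check !== ord($letter) - ord('A') + 1) {
        continue;                          // corrupted, throw the packet out
    }

    $message[(int)$seq] = $letter;         // duplicates simply overwrite themselves
}

ksort($message);                           // restore the original ordering

// print the letters, marking any missing sequence number with '_'
$reassembled = '';
for ($i = 1; $i <= max(array_keys($message)); $i++) {
    $reassembled .= isset($message[$i]) ? $message[$i] : '_';
}
echo $reassembled;                         // prints "HEL_O"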

So, what do we do about that missing L? That's where acknowledgement comes in.

Roger That, Houston

Another familiar protocol is the preflight check that pilots do before take-off--vital functions are read out, checked, and acknowledged: "Ailerons? Check. Rudder? Check."

This is known as lockstep acknowledgement, which is slow but has a number of advantages.

To begin with, the two parties never get out of sync. If the co-pilot didn't wait for the "check" before asking the next question, he might end up going too fast--question would pile on question until the pilot became overwhelmed. In computer parlance, keeping things at the right speed is called flow control. There are many schemes for doing this, but lockstep is one of the simplest.

Another significant benefit to lockstep is that there's only one thing happening at any given moment. In the case of a preflight check, this is so the pilot can focus entirely on checking for malfunction.

If the co-pilot mashed a couple of checks together: "Ailerons, rudder, elevators?", the pilot would have to remember three things at once. Furthermore, if he responded "Check, fail, check" the co-pilot would have to accurately recall what the second check was.

TFTP uses lockstep acknowledgement as both flow control and a means of error correction. A typical exchange would look like this:

"Here's the first letter."
"OK, got the first letter."
"Here's the second letter."
"OK, got the second letter."

The sender awaits an acknowledgement, and if it doesn't receive one it simply resends the packet until it does. But how does the sender decide if the packet is really lost? This brings us to timeouts.
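A sketch of how those two ideas combine on the sender's side might look like the following; send_packet() and wait_for_ack() are hypothetical stand-ins for the real network calls, and the timeout length is exactly the guess discussed in the next section:

// sketch: lockstep send with resend-on-timeout
// send_packet() and wait_for_ack() are hypothetical helpers standing in
// for the actual UDP send/receive calls
function send_block($blockNumber, $data, $timeoutSeconds, $maxRetries) {
    for ($attempt = 0; $attempt < $maxRetries; $attempt++) {
        send_packet($blockNumber, $data);

        // wait up to $timeoutSeconds for an ACK carrying this block number;
        // the helper is assumed to return false if the clock runs out
        $ack = wait_for_ack($blockNumber, $timeoutSeconds);
        if ($ack !== false) {
            return true;        // acknowledged -- move on to the next block
        }
        // no ACK: assume the packet (or the ACK itself) was lost and resend
    }
    return false;               // gave up after $maxRetries attempts
}

Note that the sender can't tell whether it was the data packet or the acknowledgement that went missing; either way it resends, and if the acknowledgement was the casualty, the receiver simply sees a duplicate block, which the sequence numbering from the previous section already copes with.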

Hello... Hello?

The essence of a timeout is: when all else fails, wait a bit, and then try again. It's a crude method, but it allows recovery from a wide range of failures. Imagine a conversation at the top of a lighthouse in a gale--a man opens the door and steps outside...

Man1: "Windy day!"
Man2: "What?"
Man1: "Huh? I didn't hear you!"
Man2: "What?"

Both parties give up. The man waits for a second and tries again:

Man1: "Windy day!"
Man2: "Ohh! Yeah."

The Internet is a very windy place--electronic counterparts of that conversation happen all the time. Timeouts are a last bulwark against total confusion.

However, timeouts have drawbacks, which is why they are often used only as safety nets. There can be a lot of time wasted between the message being lost and the clock running out, or the receiver might give up too soon.

Humans have an intuitive sense of time: the term "give me a sec" can mean anything from five seconds to five minutes depending on the context. Computers, by contrast, have lousy intuition--they need to be told exactly how long to wait, down to the microsecond. So a human has to decide in advance how long "enough" should be.

As you can imagine this isn't an ideal solution. It's telling that the TFTP specification talks a great deal about timeouts, but never tells you how long they should be. Ultimately, the coder simply has to guess.

Of course, that's still better than the alternative, which is to wait forever.

Send in the Pocket Protectors!

In 1948, Claude Shannon published a paper called "A Mathematical Theory of Communication". It sparked off an entire field of science devoted to studying what happens to a signal in the midst of noise.

The three basic building blocks we've looked at--redundancy, acknowledgement, and timeouts--are seemingly simple, but they are real head-scratchers for these scientists. For instance: there are countless ways to put redundancy into a signal, but which is the best, and how much is needed over a static-sounding phone line? How about a fibre optic strand? Or a cell phone in a subway? Acknowledgement and flow control mechanisms range from the Neanderthal (like TFTP's) to the subtly nuanced. But like Zeno's paradox, we can't ever get all the way to either 100 percent guaranteed correct communication or complete transfer efficiency. Both can only be approached asymptotically.

And the stakes are high--depending on the circumstances, error correction as a percentage of transfer cost can reach up into the double digits. On your LAN it's not a big deal, but imagine sending program code to Mars probes, or bits to and from a 15,000 RPM hard drive, or data between thousands of P2P nodes. Shaving off a fraction of a percent of overhead is a big deal.

Billions of dollars are spent each year on cramming as much accurate information down a wire or through the air as possible and still resisting nature's idle meddling. Some of the best minds of our time have devoted their lives to it.

And to think, all this, just to say HELLO.

Wednesday, December 27, 2006

Linux source code could still be at issue for enterprises

Analyst and consulting firm Gartner responded to the recent move by Linux creator Linus Torvalds and the Open Source Development Lab to formalize a process for tracking the source of Linux source code contributions.

In a bulletin to its clients, Gartner said that the move will help bring some order and present more of a formal process to open source development, but added that enterprises still face risks when bringing open source software onto their production networks.

Last week, Linus Torvalds and the OSDL announced a plan under which contributors to the Linux source code tree would have to register to have their code reviewed and included in the kernel.

Gartner says any process put in place won't be an automatic protection against future claims by organizations or individuals saying that Linux infringed upon their intellectual property. Part of the problem could come from the fact that contributions by developers who are recognized by the OSDL and part of the formal development process may still contain patent-infringing code from other sources that gets bundled into the larger contribution.

Another issue, according to Gartner, is that the OSDL's development process would not cover older versions of the Linux kernel. Also, enterprises interested in open source should realize that the OSDL's process for code verification only applies to Linux - other open source packages are not yet covered.

Gartner says the best bet for enterprises worried about patent infringements around open source is to work with suppliers and vendors that offer indemnification to customers to protect against any future claims of patent or copyright infringement in Linux by other companies or individuals.

Linux desktops have internal role at Cisco

Manning, an IT manager at Cisco Systems who supports the vendor's internal network, is behind a Linux push inside the company. The firm has already converted more than 2,000 of its engineers to Linux desktops, with plans to move many laptop users to the platform over the next few years. Manning says the driver for Linux on the desktop is not cost savings, but easier support.

"On the desktop, you're not going to save that much money by replacing Windows with Linux," says Manning, who is also chairman of the Open Source Development Lab's (OSDL) Desktop Linux Steering Committee. The OSDL is a vendor-neutral consortium that outlines common criteria for how Linux should operate in different environments. Manning spoke at the LinuxWorld conference this week on the issues of moving from Windows to Linux desktops.

Factors that even out the Linux/Windows desktop costs include retraining employees, installing software that lets Windows applications run on Linux, and support subscription fees from Linux vendors such as Red Hat, which are necessary for software updates and patches, Manning says.

The advantage of Linux on the desktop is in the ease of administration, provided by some of the built-in tools and properties of Linux. Such tools include Secure Shell (SSH), which lets a remote administrator easily access and troubleshoot a desktop. Also, the ability to hide and partition underlying system files and OS underpinnings from users on Linux is helpful.

"You don't get people going into their registry or other areas of Windows and tweaking things," Manning says.

Manning estimates that it takes a company approximately one desktop administrator to support 40 Windows PCs, while one administrator can support between 200 and 400 Linux desktops.

Manning is skeptical of the notion that Linux is the cure for buggy Windows desktops, which are vulnerable to attack.

"I'm hesitant to say that Linux is more resistant to viruses and network attacks," he says. "Windows has 95% of the market, and it gets slammed because it's so prevalent. When Linux has 95 percent or the market, it will be interesting to see how much it gets attacked."

As for how far Linux will spread at Cisco, Manning is optimistic yet realistic. He says support for the technology is growing internally; even his boss, Cisco CIO Brad Boston, has a Linux-based desktop.

"We're not going to get everyone at Cisco onto Linux," Manning says. " I would be happy if we could get 70 percent of the company."

Linux on cell phones moves up the stack

Those who follow the development of Linux as an operating system for running mobile phones, voice-enabled PDAs and other communications gadgets should keep an eye on LiPS.

The Linux Phone Standardization Forum (LiPS), which will debut later this month, is a consortium of cell phone and handheld vendors including France Telecom, Motorola, PalmSource and Trolltech. The group's goal is to promote the use of Linux as a platform for running intelligent phone-like devices.

But unlike the Open Source Development Lab's (OSDL) Mobile Linux project, which is concerned with the nuts-and-bolts issues of running Linux on small form-factor devices with mobile processors, the focus of LiPS will be on the application layer of Linux mobile phone devices. LiPS member companies say its efforts will complement the OSDL's lower-layer work on getting Linux to run on mobiles.

Cell phone and smart phone industry watchers say the use of Linux on such devices will be a big shakeup in the industry over the next several years, just as Linux has been a disruptive force in the server operating system market. Currently, most intelligent mobile devices run Symbian OS or Windows CE. But Linux on cell phones has made strides in Asia, where Motorola is reported to have sold more than three million Linux-based phones.

The fact that LiPS exists, and that it counts some powerful companies among its members, is a good sign for the future of Linux on cell phones. It appears this industry has already gotten past the initial hurdle of shrinking the kernel to run on small devices; now comes the more advanced work of smoothing out the applications running on such gadgets.

Where are the Good Open Source Games?

Despite the impressive list of achievements of open source software, it can be argued that there have not been any world-class games created under the open source banner. Sure, several old games like Doom and Quake have been gifted to the open source community, but there are no comparable original creations in this area. One should not expect this situation to change anytime soon, because the open source development model does not make sense for game development.

The State of Game Development

On August 3, 2004, Doom 3 was officially released by iD software after four years of work by some of the most talented individuals in the gaming industry. Interviews with the development staff report that from early 2004 until the recent release, 80 hour work weeks were normal and Sunday was the only official day off in the iD offices.

It would be an understatement to say that things have changed in the gaming industry over the last twenty years. Doom 3 had a four-year development cycle and an all-star development team. This may be slightly atypical, but two-year development cycles and teams of 50 or more are commonplace these days. In 1984, the average Atari 2600 video game was created by one programmer in three months. A banner title might involve two or three programmers and an artist working over a six month period.

So why do games take so long to bring to market these days?

There are some obvious answers to this question. Games today are many times more complex than games were even a few years ago. Recreating every three-dimensional point of a complex cave environment is going to take an artist several orders of magnitude more time than dropping a few rough dots on an Atari 2600's 160x192 screen and calling it a cave environment. Similarly, producing a full 5.1 surround sound track for a modern game requires sound engineers and advanced programming libraries. Triggering a few blips and bleeps is much easier.

But there are also some less obvious reasons for longer development cycles. In the old days, a programmer with a text editor and a few programs could create an entire game. However, to create all the complex content and code required for a modern game, programmers and artists need powerful tools such as 3-d modelers and advanced debuggers. Unfortunately, programmers and artists often have to use general purpose tools that are not at all well suited to game development. And when domain-specific tools do exist, such as in console game development, the tools are often unstable and immature due to the short life span of any particular console system. A multi-platform console world further complicates development by multiplying all of the issues of developing for a single platform by the number of platforms on which you intend to deliver your game.

An excellent summary of these issues can be found in the article Game Development: Harder Than You Think by Jonathan Blow.

But through all of this, one very important thing hasn't changed much. In 2004, just like in 1984, most players buy a game, play it for a while, and then move on. With the exception of a few genres, the lifespan of a single title is very short. The number of hours required for a brand new player to finish "Super Mario Bros." and "Metal Gear Solid 2" is about equal. But the amount of man-hours that went into the creation of each is not even comparable.

Open Source is not an Advantage in Game Development

It is clear that building a top-quality game is harder than ever. The amazing amount of work required, the short schedules, and the need for experts in many domains all combine to make game development one of the most challenging areas of software development. Will developing a game as open source make things easier?

Open source works best as a development model when the useful lifespan of an application is very long. It allows many users to benefit from the application and provides an opportunity for users to become volunteer developers, thus furthering the project. The continued interest of the public drives the developers long after personal interest or utility has faded. This is the state of maximum efficiency for open source and provides two huge advantages over closed development: users give back to the project, and developers can build directly on top of all of the code that has gone before them. Unfortunately, neither of these advantages exists in a meaningful way for open source games.

Most games, by their very nature, have a relatively short lifespan. This is natural. A game provides the user with an experience, but ultimately the user moves on. Since a single user is only interested in the game for a short period of time, it is unlikely that they will contribute much back to the open-source project.

In a modern game, the majority of the work goes into creating the art and story assets, not the programming. While there are plenty of open source game engines around, the bulk of a game must be created from scratch. Creating world-class art and music is hard, and you cannot build on what has gone before you in the same way that you can with software. You can take the code for an algorithm, improve it, and use it to solve a problem. You can't directly take a musical score from an older game, change a few notes, and have a better score. You will just have an odd piece of music that sounds like a poor version of the original.