Representing ride check survey data in a GIS The case study of Cape Town, South Africa

Dondo C.
Department of Geomatics

Rivett U. Dr.
Department of Civil Engineering
University of Cape Town 7701 Rondebosch
Cape Town, South Africa
e-mail: [email protected]
e-mail: [email protected]

It is essential for local authorities and public transport service providers to carry out public transport surveys regularly. These surveys supply the data used in public transport planning. The South African government, recognising the importance of acquiring and maintaining such information, introduced a policy that makes it compulsory for every local authority to keep a current public transport record. One of the surveys most commonly used to collect data for this record is the ride check survey. To obtain more accurate data from these surveys, some local authorities have adopted global positioning systems (GPS) for accurate positioning and hand-held digital computers for electronic data recording. Since most local authorities already hold their data in a GIS, it becomes imperative to be able to store the survey data there as well. However, the large amounts of data collected need to be stored efficiently, with minimal data redundancy. This paper focuses on the development of a data model for efficiently including ride check survey data in a GIS. Implementation of the data model is based on the object-relational database technique and the dynamic segmentation method.

1. Introduction
It is essential for local authorities and public transport service providers to carry out surveys on public transport usage regularly. The data collected from these surveys is instrumental in the planning and design of public transport facilities and in planning for future use of public transport (Macpherson 1993).

The methods of data collection have evolved from manual to electronic, and at the same time many local authorities are adopting GIS for transport information management and service planning. With continuing advances in technology, the extent, accuracy and amount of data that can be collected keep growing. The challenge lies in organising this data so that it fits in with the rest of the local authorities’ spatial and attribute data in the GIS and serves its intended purpose.

The objective of this paper is to investigate the representation of ride check survey data in a GIS. The next section provides an overview of how ride check surveys are conducted in Cape Town, South Africa. This will be followed by a discussion on how and why positioning systems and handheld digital computers have been adopted for use in these surveys. The rest of the paper then reports on the design and implementation of a suitable data model to include such data in a GIS.

2. Background
Recognising the importance of surveys to public transport service planning and information management, the government of South Africa has set up policies that make it compulsory for every local authority to maintain a current public transport record. This record should give an overview of the extent of public transport services and of the availability and location of public transport facilities. It should also contain current public transport usage statistics and information on public transport users’ preferences and needs (Department of Transport 2001). To acquire this data, each transport authority has to carry out public transport surveys. In Cape Town, South Africa, one of the surveys most commonly used for this purpose is the ride check survey (Moving Ahead 2001). Ride check surveys are also considered more advantageous than public transport surveys that involve distributing questionnaires to passengers, because they are based on the observation method of data collection. This method yields more accurate data because it does not rely on the respondent’s willingness and ability to answer the questions (Wermuth et al 2001).

Traditionally, during ride check surveys in Cape Town, a surveyor would board a public transport vehicle for the duration of one trip. At every stop, the surveyor would manually record the number of people boarding or getting off the vehicle, the method of payment used by the boarding passengers, descriptive information on the position of the stop and the stop number. This presented a problem when the vehicle stopped at a place that was neither an existing stop nor a terminus, because no location information on that place was available in the database. Any attempt to represent such a place on a map resulted in an inaccurate representation, because only an approximate position could be used. Sometimes, when visibility was poor, the surveyor found it difficult to read the stop number from the stop shelter in the short time that the vehicle stopped to drop off or pick up passengers. The manual recording of arrival and departure times at each stop and the counting of people getting on and off the vehicle also resulted in inaccurate data due to human error and fatigue.

To overcome these shortfalls, some local authorities in South Africa have adopted automatic positioning systems for positioning and electronic devices for recording the data during these surveys. Similar problems are being experienced in other countries, and the adoption of these devices is a worldwide phenomenon rather than one confined to South Africa (Shaw 1999 and Murakami et al 1997). An example of the use of this technology in South Africa is the use of global positioning systems (GPS) and Palm Pilots by the Cape Metropolitan Council in carrying out ride check surveys in the year 2000 (AfriGIS 2000).

During these ride check surveys, a surveyor was placed on a selected bus trip, armed with a GPS receiver and a palm pilot. At every stop, the surveyor recorded the following information:

  • The stop position in terms of x and y co-ordinates.
  • The arrival and departure times.
  • The number of passengers boarding and alighting.
  • The method of payment used (whether cash or clip card).

This leads to the generation of large amounts of data, which creates the need for effective methods of data organisation, representation and visualisation (Goodchild 1999 and Papacostas 1995).
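As an illustration only, the items recorded at each stop can be pictured as a single record type. The sketch below is not taken from the survey software; the class name, field names, co-ordinates and times are all assumptions made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class StopObservation:
    """One record captured at a single stop (hypothetical field names)."""
    x: float              # stop position, x co-ordinate from the GPS receiver
    y: float              # stop position, y co-ordinate from the GPS receiver
    arrival: datetime     # time the vehicle arrived at the stop
    departure: datetime   # time the vehicle left the stop
    boarding: int         # passengers getting on
    alighting: int        # passengers getting off
    paid_cash: int        # boarding passengers who paid cash
    paid_clip_card: int   # boarding passengers who used a clip card

    def dwell_seconds(self) -> float:
        """Time spent at the stop, in seconds."""
        return (self.departure - self.arrival).total_seconds()


# Made-up values, for illustration only.
obs = StopObservation(
    x=1000.0,
    y=2000.0,
    arrival=datetime(2000, 5, 15, 7, 30, 0),
    departure=datetime(2000, 5, 15, 7, 30, 45),
    boarding=5,
    alighting=3,
    paid_cash=2,
    paid_clip_card=3,
)
print(obs.dwell_seconds())
```

One such record per stop per trip, repeated over many trips and routes, is what makes the volume of data grow quickly.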

As most local authorities and public transport service providers are already using GIS for transport information management and service planning, GIS is a natural choice for storing the ride check survey data. GIS provides capabilities for graphical presentation that make analysis of the data easier (Zhong 1998, Wang et al 2001, Attanucci et al 1999). It is also the preferred output format for any current public transport record according to the South African public transport policy (Department of Transport 2001).

In most cases, attempts to include such survey data in the GIS lead to the data being stored in flat files. This results in databases with many redundancies. To represent this kind of data efficiently in a GIS, a suitable data model has to be designed. The following sections discuss the design and implementation of a suitable data model for representing ride check survey data in a GIS.

3. Data model design
The first stage in the design of a data model is the design of a conceptual data model. A conceptual data model is a representation of the data to be included in the system and of how the data items relate to one another. It can be viewed as an idea of how the data is to be modelled in the system, independent of the software used for implementation and of how the data will be stored in the computer (Laurini et al 1996).

The entity-relationship model has become the most popular approach for representing conceptual data models (Worboys 1995) and is adopted in this paper. An entity is a thing that can be uniquely identified, for example a specific car or a specific person. Entities can be grouped into classes of similar objects known as entity types. A relationship is an association among entities or entity types (Chen 1976). The notation adopted in this paper for representing entities and relationships is illustrated in Figure 1 below.

Forward: A person must have one car
Backward: A car may be owned by one or more people

Figure 1: The entity-relationship diagramming notation
3.1. The conceptual data model
Figure 2 below illustrates the conceptual data model for representing ride check survey data in a GIS.

Figure 2: The conceptual data model
The five entity types illustrated in Figure 2 above are road, public transport route, trip path, stop and trip. A road represents the network that is used daily by different modes of transport. A public transport route is a physical feature representing a path taken by a public transport vehicle travelling along the transport network.

A stop is a place where the vehicle stopped to drop off or pick up passengers. It can be an existing physical stop, a terminus (a collection of more than one stop) or just a place where the vehicle stopped during that particular trip (referred to as a trip stop in this paper). The trip path represents a path from one trip point (a stop of any of these kinds) to the next along a public transport route. A trip represents a one-way journey from an origin stop to a destination stop. The distance travelled on each trip is equivalent to the distance travelled on one public transport route.

Each conceptual data model should be followed by a set of enterprise rules that govern the data (Howe 1989). The enterprise rules for the data model illustrated in Figure 2 above are as follows.

  • A road may have one or more public transport routes passing through it.
  • A public transport route may pass through one or more roads.
  • A public transport route may have one or more trip paths.
  • A trip path must be in one public transport route.
  • A trip path must begin and end at a trip point.
  • A stop may be the beginning or end of one or more trip paths.
  • A stop may have one or more trips assigned to it.
  • A trip may be assigned to one or more stops.

3.2. The logical data model
The conceptual data model shown in Figure 2 above is just a schema of how things appear to the designer or user. It has to be converted into a model that can be implemented in software. This model is the logical data model. It is derived by applying theoretically sound database design principles to the conceptual data model. Figure 3 below illustrates the logical data model resulting from the conceptual data model discussed in the previous section.

As Figure 3 below shows, each many-to-many relationship is decomposed into one-to-many relationships, because many software products cannot support many-to-many relationships directly. This is done by introducing a relationship table between the entities participating in the many-to-many relationship, for example between road and public transport route. The unique identifiers in each table are shown underlined, and the posted identifiers are shown for each relationship.
Figure 3: Logical data model
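
The decomposition of the road and public transport route relationship can be sketched with a small in-memory relational database. This is a minimal illustration using Python's built-in sqlite3 module, not the schema of Figure 3; the table names, column names and sample rows are assumptions invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE road  (road_id  INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE route (route_id INTEGER PRIMARY KEY, name TEXT);
-- Relationship table: one row per (road, route) pair, so each side
-- of the many-to-many relationship becomes a one-to-many relationship.
CREATE TABLE road_route (
    road_id  INTEGER NOT NULL REFERENCES road(road_id),
    route_id INTEGER NOT NULL REFERENCES route(route_id),
    PRIMARY KEY (road_id, route_id)
);
""")

# Made-up sample data.
conn.executemany("INSERT INTO road VALUES (?, ?)",
                 [(1, "Main Road"), (2, "Station Road")])
conn.executemany("INSERT INTO route VALUES (?, ?)",
                 [(10, "Route A"), (11, "Route B")])
conn.executemany("INSERT INTO road_route VALUES (?, ?)",
                 [(1, 10), (1, 11), (2, 10)])


def routes_on_road(road_id):
    """Names of the routes that pass along the given road."""
    rows = conn.execute(
        "SELECT r.name FROM route r "
        "JOIN road_route rr ON rr.route_id = r.route_id "
        "WHERE rr.road_id = ? ORDER BY r.name", (road_id,))
    return [name for (name,) in rows]


def roads_on_route(route_id):
    """Names of the roads that the given route passes along."""
    rows = conn.execute(
        "SELECT d.name FROM road d "
        "JOIN road_route rr ON rr.road_id = d.road_id "
        "WHERE rr.route_id = ? ORDER BY d.name", (route_id,))
    return [name for (name,) in rows]


print(routes_on_road(1))
print(roads_on_route(10))
```

Each road and each route is stored once, and only the pairs of identifiers are repeated, which is how the relationship table avoids the redundancy of a flat file.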
4. Implementation of the logical data model
The logical data model represented in Figure 3 above was implemented in ArcInfo 8 using the geodatabase data model, which is an object-oriented data model. All the entities illustrated in the conceptual and logical data models are treated as objects in the geodatabase. Unlike an entity, which has attributes only, an object also has a set of operations that it performs under appropriate conditions, and these define its behaviour. Implementation in the geodatabase has also been described in some texts as an object-relational approach. This is because, although the entities are treated as objects with behaviour, they are still implemented as tables with relationships between them, as in the relational data model (Harrington 2000).

Inheritance is one of the main properties of object-orientation that is essential for modelling complex data in a GIS and is the most used concept in the ArcInfo geodatabase model. ArcInfo provides a hierarchy of object classes that are ready for use. By specifying which standard ArcInfo object classes the objects from the designed data model inherit their behaviour from, custom behaviour can be bestowed on these created objects. ArcInfo standard classes that are of importance to the implementation of the data model presented in this paper are (Egenhofer 1992 and Zeiler 1999):

  • The object class – an ArcInfo standard class for representing objects with no spatial representation. The trip and trip path are modelled as object classes.
  • The feature class – an ArcInfo standard class for representing objects with a spatial representation. Examples are lines, points, multipoints and polygons. The public transport route entity is represented by lines created from sections of the arcs making up the roads, through the process of dynamic segmentation. The trip path entity is then modelled as an event table of the public transport route. Points represent the stop entity.
  • The simple edge feature class – an ArcInfo standard class representing the lines that make up a network. It is used to represent the road entity.
  • The relationship class – an ArcInfo standard class that represents the relationships among the different object tables. The relationships between all the objects are modelled as relationship classes.
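
The idea behind dynamic segmentation, as used above for trip paths, is that an event table row stores only a route identifier and a pair of measures along the route, and the geometry is derived from the route line when needed. The following is a minimal sketch of that idea in plain Python, not the ArcInfo implementation; the function names, the route geometry and the event rows are hypothetical.

```python
import math


def _interpolate(p, q, t):
    """Point at fraction t (0..1) of the way from p to q."""
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))


def segment_between(route, from_m, to_m):
    """Return the vertices of the part of `route` between two measures.

    `route` is a list of (x, y) vertices; measures are distances
    along the line from its first vertex.
    """
    if not 0 <= from_m <= to_m:
        raise ValueError("expected 0 <= from_m <= to_m")
    result = []
    travelled = 0.0
    for p, q in zip(route, route[1:]):
        length = math.hypot(q[0] - p[0], q[1] - p[1])
        start, end = travelled, travelled + length
        # Keep only the part of this piece that overlaps the event.
        if length > 0 and end >= from_m and start <= to_m:
            lo, hi = max(from_m, start), min(to_m, end)
            for m in (lo, hi):
                pt = _interpolate(p, q, (m - start) / length)
                if not result or result[-1] != pt:
                    result.append(pt)
        travelled = end
    return result


# Hypothetical route line and event rows:
# (route_id, from_measure, to_measure, passengers).
route_a = [(0, 0), (100, 0), (100, 50)]
trip_path_events = [("A", 0, 40, 12), ("A", 40, 120, 14)]

for route_id, from_m, to_m, passengers in trip_path_events:
    print(route_id, passengers, segment_between(route_a, from_m, to_m))
```

Because the event rows carry no co-ordinates of their own, the route geometry is stored once however many trips are surveyed along it.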

Using the concept of inheritance, subtypes are used to represent stops. As mentioned earlier, a stop can be an existing physical stop, a terminus or any point where the vehicle stopped on that particular trip. The mode of public transport used for implementing the data model was the bus. The bus stop, bus terminus and trip stop are subtypes of trip point (see Figure 4 below).

All the subtypes inherit the attributes of stop, in addition to having their own attributes. The use of subtypes also enables the different subtypes in each table to be displayed using different symbols.
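
As an analogy only, the same classification can be written as a small class hierarchy in Python. This is not how geodatabase subtypes are defined in ArcInfo; the attribute names and symbol names below are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class TripPoint:
    """Attributes shared by every kind of stop (hypothetical names)."""
    point_id: int
    x: float
    y: float
    symbol = "circle"  # plain class attribute, not a dataclass field


@dataclass
class BusStop(TripPoint):
    stop_number: str = ""   # attribute specific to existing stops
    symbol = "square"


@dataclass
class BusTerminus(TripPoint):
    name: str = ""          # attribute specific to termini
    symbol = "star"


@dataclass
class TripStop(TripPoint):
    symbol = "triangle"     # a place where the vehicle stopped on one trip


# Made-up points, for illustration only.
points = [
    BusStop(1, 0.0, 0.0, stop_number="B12"),
    BusTerminus(2, 50.0, 10.0, name="Central"),
    TripStop(3, 75.0, 20.0),
]
for p in points:
    print(type(p).__name__, p.point_id, p.symbol)
```

Every subtype carries the shared identifier and position, adds its own attributes where needed, and maps to its own display symbol.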

Figure 4: Use of inheritance and subtypes in classifying trip points into bus termini, bus stop and trip stop
5. Presentation of results
The ride check survey data can be presented at route section level or at stop level, as shown in Figures 5 to 7 below.

Figure 5 below shows the number of people who were travelling along each section of a route during the ride check survey. The numbers are represented using varying line widths: the wider the line, the greater the number of passengers. The attribute table gives the stop at which the route section started, the stop at which it ended and the date of the survey. The trip stop subtypes are displayed using different symbols.
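
The paper does not state how the number of passengers on a route section was derived. One plausible way, sketched here under the assumption that the vehicle starts the trip empty, is a running total of boardings minus alightings at each stop. The function name and the counts are made up for the example.

```python
from itertools import accumulate


def section_loads(counts):
    """Passengers on board along each section between consecutive stops.

    `counts` holds one (boarding, alighting) pair per stop, in travel
    order. Assumes the vehicle is empty before the first stop.
    """
    net_change = [boarding - alighting for boarding, alighting in counts]
    on_board = list(accumulate(net_change))
    # The total after the last stop belongs to no section, so drop it.
    return on_board[:-1]


# Made-up counts for a trip with four stops.
counts = [(12, 0), (5, 3), (2, 6), (0, 10)]
print(section_loads(counts))
```

Each value in the result would then drive the line width of the corresponding route section on the map.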

Figure 5: Visual representation of the ride check survey data in a GIS
Figure 6 below shows graphs at each stop representing the number of passengers boarding or getting off the bus.

Figure 6: Graphs at each stop showing the number of people getting on or off the bus
Figure 7 below shows graphs at each stop of the number of boarding passengers who paid for their ticket using cash and the number who paid using a clip card.

Figure 7: Number of boarding passengers at each stop paying either using cash or card
6. Conclusions
Many local authorities and public transport service providers have been using GIS for managing their information and for performing analysis and modelling. In most cases, the main challenge lies in organising this data effectively in order to minimise data redundancy and maximise query and analysis performance. This paper presented a data model that can be used to assist in managing ride check survey data in a GIS. The data model can be extended and used to represent other transport survey data.

The use of an object-oriented data model results in the creation of objects with custom behaviour. For example, with the object-oriented geodatabase model, a line representing a road and one representing the boundary of a building can be differentiated because of their different behaviour and characteristics. The use of subtypes enables different types of an object to be displayed using different symbols. Without subtypes, the spatial data would in some cases have to be separated into different layers in order to display the features using different symbols (see Figure 5 above).

Indeed, using a GIS to store, analyse and present public transport data provides a whole range of benefits to transport authorities. However, without a proper data model to represent and integrate the different kinds of spatial and attribute data, some of its capabilities and benefits are not fully realised.


References

  • AfriGIS. 2000. The CMC and AfriGIS Breaking new ground with TransCAD GIS Software. Press Release [Online]. Available: https://www.afrigis.co.za [12 June 2002]
  • Attanucci, J. P., Halvorsen, R. 1999. What GIS can do for transit planning. [Online]. Available: [15 July 2002]
  • Chen, P. P. S. 1976. The Entity-Relationship Model: Toward a unified view of data. Association for Computing Machinery Transactions on Database Systems. 1(1), pp 9-36
  • Department of Transport. 2001. NLTTA: TPR4: Non-metropolitan Current Public Transport Record. [Online]. Available: [20 April 2002]
  • Egenhofer, M. J. 1992. Object-Oriented modeling in GIS. URISA Journal. 4(2): pp 3-19
  • Goodchild, M. F. 1999. ‘GIS and Transportation: Status and Challenges,’ in Proceedings International Workshop on Geographic Information Systems for Transportation (GIS-T) and Intelligent Transportation Systems (ITS). Hong Kong [Online]. Available: [12 December 2001]
  • Harrington, J. L. 2000. Object-Oriented Database Design Clearly Explained. Morgan Kaufmann Publishers. Academic Press.
  • Howe, D. R. 1989. Data Analysis for Database Design. Edward Arnold. Second Edition.
  • Laurini, R., Thompson, D. 1996. Fundamentals of Spatial Information Systems. Academic Press Limited. Fifth Printing
  • Macpherson, G. 1993. Highway and Transportation Engineering and Planning. Longman Scientific and Technical.
  • Moving Ahead. 2001. City of Cape Town Transport Plan, Part 2: Public Transport-Operational Component. City of Cape Town: Creative Services.
  • Murakami, E., Wagner, D. P., Neumeister, D. M. 1997. Using Global Positioning Systems and Personal Digital Assistants for Personal Travel Surveys in the United States. [Online]. Available: https://www.nas.edu/trb/publications/ec008/session_b.pdf [15 November 2002]
  • Papacostas, C. S. 1995. GIS Applications to the Monitoring of Bus Operations. University of Hawaii. [Online]. Available: https://www.eng.hawaii.edu/~csp/Mygis/busgis.html [12 December 2001]
  • Shaw, S. L. 1999. ‘Handling Disaggregate Travel Data in GIS,’ in Proceedings International Workshop on Geographic Information Systems for Transportation (GIS-T) and Intelligent Transportation Systems (ITS). Hong Kong [Online]. Available: [12 December 2001]
  • Wang, D., Cheng, T. 2001. A spatio-temporal data model for activity-based transport demand modeling. International Journal of Geographical Information Science 15 (6), pp. 561-585.
  • Wermuth, M., Sommer, C., Kreitz, M. 2001. ‘Impact of new technologies in travel surveys,’ in Proceedings International Conference on Transport Survey Quality and Innovation, South Africa [14 November 2002]
  • Worboys, M.F. 1995. GIS: A Computing Perspective. Taylor and Francis Limited.
  • Zeiler, M. 1999. Modelling Our World. The ESRI Guide to Geodatabase Design. Environmental Systems Research Institute Inc.
  • Zhong, R.P. 1998. Internet GIS and its Applications in Transportation. TR News March-April 1998