Machine learning is a part of artificial intelligence and is the study of how computers/machines learn tasks such as recognition, planning, prediction, robot control, game strategy, etc. The techniques used in machine learning include neural networks, decision trees, statistical learning, reinforcement learning and inductive logic programming. There are many existing applications of machine learning techniques and many potential future uses to investigate, as well as many theoretical questions worth studying.
Many non-classical logics have been introduced for both theoretical and applied computer science, for example, intuitionistic logic, modal logics, substructural logics and fuzzy logics. In conjunction with these logics there exist classes of algebraic structures that serve as models and are useful for the study of the logics, in the way that Boolean algebras are models for classical logic. We consider mainly theoretical problems for such logics and algebras, such as decidability, complexity, axiomatization and structural properties. Lattice theory and universal algebra underpin the algebraic methods.
Computational intelligence is a part of artificial intelligence. Its principal areas of research are fuzzy logic, neural networks, genetic algorithms and combinations of these. Computational intelligence has been applied extensively in a variety of domains, and there are many potential future uses to investigate, as well as theoretical questions worth studying.
Formal language theory is a key concept in computer science, both in theory and practice. It had its beginnings in fields such as biology, electrical engineering, logic, and linguistics, while its development was largely driven by the need for formal specifications of programming languages and the design of compilers. Due to the latter, the majority of work in formal language theory has been done for string languages. However, during the last decades, formal language theory has begun to play an increasingly important role in the generation of objects other than strings, such as graphs, pictures, and trees, and corresponding types of grammars have been developed.
Random context picture grammars (rcpgs) are a method of syntactic picture generation. Most such methods are either context-free or context-sensitive. The former are elegant but weak, while the latter are powerful but make it difficult to prove theorems. Rcpgs lie between these extremes: they are strictly more powerful than context-free grammars, but it is possible to develop characterization theorems for them and many of their subclasses. Such theorems have enabled us to establish relationships between various subclasses, but there are still open questions. Generalized random context picture grammars (grcpgs) are, as the name indicates, a generalization of rcpgs. It has been proven that these grammars are strictly more powerful than Iterated Function Systems (IFSs), which are widely used to generate fractals. This relationship opens up new avenues for research, e.g., determining whether or not theorems that have been proven for IFSs can be generalized to grcpgs.
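For readers unfamiliar with IFSs, the following is a minimal sketch of how one generates a fractal, here an approximation of the Sierpinski triangle. It illustrates IFSs only, not rcpgs or grcpgs; the function name `sierpinski_ifs`, the choice of triangle corners and the starting point are assumptions made for this example.

```python
def sierpinski_ifs(iterations):
    """Apply three contractive affine maps repeatedly, starting from one point."""
    # Hypothetical corner coordinates chosen for illustration.
    corners = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]
    # Each map moves a point halfway toward one corner of the triangle.
    maps = [
        lambda p, c=c: ((p[0] + c[0]) / 2, (p[1] + c[1]) / 2)
        for c in corners
    ]
    points = [(0.0, 0.0)]
    for _ in range(iterations):
        # One step of the IFS: the union of the images under all maps.
        points = [f(p) for p in points for f in maps]
    return points


points = sierpinski_ifs(5)
print(len(points))  # 3**5 points approximating the fractal
```

Each iteration triples the number of points, and plotting them for a larger number of iterations shows the familiar self-similar triangle.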
Together with Dr Frank Drewes of Umeå University, Sweden, Sigrid Ewert obtained a research grant for 2009–2011 for a project titled Tree Automata in Computational Language Technology. The grant funds research visits between the collaborative partners and their students. The purpose of the collaboration is the further development of the theory of tree automata with special emphasis on applications in computational language technology (CLT). The development of such technology is particularly important for South Africa, because it has 11 official languages, all of which are mother tongues for some citizens. It also has a high illiteracy rate. In order to address this situation as quickly and efficiently as possible, technology needs to be developed that can, for example, translate textbooks from English into the other 10 languages. The theory of tree automata is well suited for applications in CLT, because information in CLT is usually represented by tree-structured data. Various types of tree grammars and transductions are known, but thus far mainly the context-free case has been studied. Adding random context or bag context greatly extends the power of these models and opens up new avenues for research. Sigrid Ewert obtained a research grant for 2008–2011 for a project titled Random Context in the Generation and Transformation of Trees. The grant funds research visits between the grant-holder and her research collaborators. The project aims to make a thorough theoretical study of various classes of tree grammars and tree transducers extended by the mechanisms of random context or bag context, and their application to domains such as the generation and transformation of strings, pictures, or graphs.
The cost of energy utilized at large data centers has recently become an issue of paramount concern. Long-term retention of vital information has typically been done on tape-based mass storage systems. The trend now is towards systems that are either entirely or predominantly disk-based, thereby enabling access to data at on-line or near-line speeds. Disk-based mass storage systems are now feasible due to the availability of low-cost, reliable disk drive technology such as Serial ATA (SATA). A typical mass storage system, even today, involves thousands of disk drives. The large number of spinning drives has created a large and growing energy usage concern. Given the trend of rising global fuel and energy prices and the high rate of data growth, the challenge is to implement appropriate configurations of mass storage systems that include multi-tier disk systems which are not only energy efficient but also meet the data access response time requirements of applications. The Energy Efficient Storage System research is intended to address this challenge. The primary goal is to investigate, implement and deploy modules for the management of reliable low-cost disk-based storage systems that maximize the energy savings while meeting the performance goals of the storage systems with respect to access latency, reliability and operational costs. The basic idea is to utilize the concepts of a massive array of idle disks (MAID). The array of disks is partitioned into active and passive groups. The active group consists of continuously spinning disks, while the passive group consists of disks that are powered down after some period of inactivity and powered up when accessed, at the cost of a longer latency delay. By using the active disks as a cache and directing accesses to be concentrated on the active disks, accesses to passive disks are minimized.
Research issues that arise concern how the disk pool should be partitioned into active and passive groups; what scheduling policies should be used for processing data requests; what energy-aware cache replacement policies and algorithms should be used; and what protocols (iSCSI, FC) should be used for interfacing applications to the storage. The research activities also concern the development of analytical and experimental simulation models for analyzing the energy consumption of different configurations of disks and the impact of Solid State Disks when included in the mix of the total storage used.
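The trade-off between the active and passive groups can be illustrated with a toy model. The sketch below is not the project's simulator: the class `PassiveDisk`, the function `spin_up_penalty`, the fixed set of cached blocks and all numeric values are assumptions made for illustration. It treats a passive disk as spinning if it was accessed within its idle timeout, and charges a fixed spin-up delay otherwise.

```python
from dataclasses import dataclass


@dataclass
class PassiveDisk:
    """A disk that powers down after `idle_timeout` seconds without access."""
    idle_timeout: float    # hypothetical value, in seconds
    spin_up_delay: float   # hypothetical value, in seconds
    last_access: float = float("-inf")  # disk starts powered down

    def access(self, now: float) -> float:
        """Return the extra latency paid by an access at time `now`."""
        spinning = (now - self.last_access) <= self.idle_timeout
        self.last_access = now
        return 0.0 if spinning else self.spin_up_delay


def spin_up_penalty(requests, cached_blocks, disk):
    """Sum spin-up delays for requests that miss the active-disk cache."""
    penalty = 0.0
    for now, block in requests:
        # Blocks held on the active (always spinning) disks cost nothing extra.
        if block not in cached_blocks:
            penalty += disk.access(now)
    return penalty


# (time in seconds, block id) pairs; an invented request trace.
requests = [(0.0, "a"), (1.0, "b"), (2.0, "a"), (50.0, "b"), (51.0, "c")]
no_cache = spin_up_penalty(requests, set(), PassiveDisk(10.0, 8.0))
with_cache = spin_up_penalty(requests, {"a", "b"}, PassiveDisk(10.0, 8.0))
print(no_cache, with_cache)
```

With the frequently requested blocks held on the active disks, the passive disk is woken less often. A realistic model would also account for the energy drawn in each power state and for a cache replacement policy, both of which are omitted here.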
Developing countries, primarily in Africa, Latin America, the Far East, etc., have been targeted as those most likely to benefit tremendously from Course Management Systems (CMS), or Learning Management Systems (LMS). While this may well be the case, the contents of the course presentations are primarily PowerPoint slides with little or no lively interaction explaining the concepts being presented. Delivering content in an LMS as PowerPoint presentations does not make scientific and engineering concepts any simpler to understand for students encountering new concepts that are not part of the day-to-day vocabulary of their native tongue or that are alien to their cultural lifestyle. Animation of scientific concepts is more desirable when mathematical, scientific and engineering concepts have to be explained through eLearning in environments where these concepts are not easily related to practical realities. The objective of this research work is to promote the development and implementation of lecture material of scientific concepts with controlled animation. Unlike lecture presentations delivered by video, these lecture notes are developed as PDF files with controlled fast-forward, playback and animated illustrations. The work originated in attempts to animate algorithms using the Java programming language; the tools used in this project are LaTeX, Beamer, PGF/TikZ, MetaPost, Inkscape, Dia, etc. The presentation slides produced are PDF files generated from overlays of diagrams that can be played and controlled to provide the illusion of dynamic and animated illustration of scientific concepts. Such lectures, similar to PowerPoint lectures, can be downloaded by students onto any portable device with a PDF reader and replayed many times over.
Distributed computing is now pervasive in the provision of both local and worldwide services. Our research programme investigates foundations and novel applications of technologies for the development of distributed services. We focus on foundations of Service-Oriented Computing, Autonomic, Peer-to-Peer, Context-Aware and Mobile Computing. The general techniques used involve exploiting locality in sharing and decision-making with the aim of improving performance and fault tolerance. Application areas targeted are primarily in information dissemination, synchronization, buffer management, and peer-to-peer systems. More specifically, we address problems dealing with:
Image processing typically involves the computer manipulation of digital images (photographs, etc.) with the intention of developing an understanding of the content of the images. It has application in many areas. Urban sprawl (formal and informal) needs to be monitored to enable better town planning. This monitoring is data intensive and thus the process of automatically categorizing/processing the data is essential. Various image processing techniques can be applied to process this data but further work is required to enable fully automated image segmentation, analysis and understanding. Some work is currently being done on monitoring of informal settlements in the Cape Town area.
Recursion is known to be difficult to teach and many students do not fully understand the process. A student’s knowledge of recursion is their mental model of recursion. A mental model is viable if it allows the student to accurately and consistently represent the mechanics of recursion. Recent research at Wits has identified a number of mental models of recursion and shown that many students do not build viable mental models. In addition, it has been shown elsewhere that students in later years do not apply recursion when problem solving. This work builds on previous research by considering questions like: Do the same models occur in the current first year class? How can these models be used for diagnostic teaching? What models do senior students have and do they use their knowledge of recursion? Another interesting question arises from the way we teach recursion. We show the students lots of example traces using (predominantly) the “boxes method” of tracing which is essentially conveying a “copies model”. This should, at the very least, help the students to do traces of recursive algorithms using a “copies model” approach. The question is whether the students actually gain a deeper understanding of recursion or whether they have just been given a process to follow. Some progress has been made in answering this question but more work could be done.
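The kind of trace that a "copies model" suggests can be sketched in code. The example below is an illustration only and is not taken from the research described: the choice of factorial and the parameters `depth` and `trace` are assumptions. Each call records a line when it starts and when it returns, indented by its depth, so that every active copy of the function appears as its own "box".

```python
def factorial(n, depth=0, trace=None):
    """Compute n! while recording one entry and one exit line per call."""
    if trace is None:
        trace = []
    indent = "  " * depth
    trace.append(f"{indent}factorial({n}) called")
    if n <= 1:
        result = 1  # base case: no further copy is created
    else:
        # Recursive case: a new copy runs to completion before this one resumes.
        result = n * factorial(n - 1, depth + 1, trace)[0]
    trace.append(f"{indent}factorial({n}) returns {result}")
    return result, trace


value, trace = factorial(3)
print("\n".join(trace))
```

The printed trace nests the call for 1 inside the call for 2 inside the call for 3, and the return lines appear in the reverse order, which is the behaviour a viable mental model has to account for.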
Disruptive features, such as potholes and faults, in gold and platinum bearing reefs impact negatively on mining – increasing costs and reducing safety. Using borehole radar the area just ahead of mining can be mapped to detect these features. The borehole radar data currently requires an experienced geophysicist to interpret them and this interpretation must often be done off-site. By the time the interpreted results are sent back to the mine, the area in question has already been mined out – sometimes with serious consequences. The aim of this research project is to consider ways of automatically interpreting the borehole radar data so that the interpretation can be done quickly on-site.
CYCLE PICKING is a decision problem which asks whether it is possible to find a collection of cycles of size J (from the set of all cycles in a directed graph) which includes every node in the graph in at least one of the chosen cycles. CYCLE PICKING has been shown to be NP-Complete by a transformation from MINIMUM COVER. This project will look at finding approximation algorithms for CYCLE PICKING. (Note: Other problems related to CYCLE PICKING are also NP-Complete and the project could investigate approximation algorithms for these problems as well.)
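To make the problem concrete, here is a brute-force sketch for very small graphs. It rests on an assumption about the definition: "size J" is read as "a collection of at most J cycles". The function names `simple_cycles` and `cycle_picking` and the example graph are hypothetical, and the search takes exponential time, as one would expect for an NP-Complete problem; it is not an approximation algorithm.

```python
from itertools import combinations


def simple_cycles(graph):
    """Return the node sets of all simple directed cycles in `graph`.

    `graph` maps each node to a list of successors; nodes must be comparable.
    Each cycle is discovered from its smallest node. Cycles that visit the
    same nodes are merged, which is enough for a covering question.
    """
    cycles = set()

    def extend(start, node, path):
        for nxt in graph.get(node, []):
            if nxt == start:
                cycles.add(frozenset(path))
            elif nxt > start and nxt not in path:
                extend(start, nxt, path + [nxt])

    for start in graph:
        extend(start, start, [start])
    return cycles


def cycle_picking(graph, j):
    """Return True if at most `j` cycles together include every node."""
    nodes = set(graph) | {v for succ in graph.values() for v in succ}
    if not nodes:
        return True
    cycles = list(simple_cycles(graph))
    for k in range(1, j + 1):
        for chosen in combinations(cycles, k):
            if set().union(*chosen) == nodes:
                return True
    return False


# Two cycles sharing node 2: 1 -> 2 -> 1 and 2 -> 3 -> 4 -> 2.
graph = {1: [2], 2: [1, 3], 3: [4], 4: [2]}
print(cycle_picking(graph, 1), cycle_picking(graph, 2))
```

In this graph no single cycle includes all four nodes, but the two cycles together do.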
Hyperspectral images are made up of pixels whose spectral information is much richer than that of ordinary images. A digital camera will associate three values with each pixel: the amounts of red, green and blue. A satellite imager like Landsat (multispectral) has seven values associated with each pixel: the amounts of red, green, blue, three readings in the infrared, and one in the thermal. This allows information to be deduced about the amount and health of vegetation and some crude mineral information. But a hyperspectral imager will record as many as 1000 different values for each pixel! This gives a tremendous amount of information about the material being photographed, but also produces huge data sets which need special management and processing. This leads to many interesting problems in terms of algorithms and information extraction for such data sets. Hyperspectral data has a wide variety of applications including forestry, vegetation surveys, environmental monitoring around mining areas, and target detection applications. Techniques used include algorithm design and complexity, numerical methods, optimisation procedures and parallelisation. An active use of hyperspectral data is in the mining industry because many minerals can be identified with the data provided. Below is an image showing a section of mining core (pretty nondescript to the naked eye) and then the varied automatically generated products from the hyperspectral data. The data is obtained by the Hyperspectral Core Imager designed for, and owned by, AngloGold Ashanti. The instrument generates 1Gb of data for every three meters of core. Often there are hundreds of meters of core!
Multi-agent systems are systems of interacting components that exhibit characteristics usually associated with the concept of agency (such as possessing knowledge, communication with other components, and planning their actions). Such systems are often used in critical applications, where the price of failure is exceedingly high. When dealing with such mission-critical applications, one needs to either design from scratch a system conforming to given specifications or to check that a given system works as intended. Logic is used as a formal tool for fulfilling these tasks. I’m working on logics that can be used to design or verify the functionality of multi-agent systems.
Facial expressions convey non-verbal cues, which play an important role in interpersonal relations. Automatic recognition of facial expressions can be an important component of natural human-machine interfaces; it may also be used in behavioral science and in clinical practice, and it is crucial for communication using sign languages. Although humans recognize facial expressions virtually without effort or delay, reliable expression recognition by machine is still a challenge. Facial expressions are supplementary to gestures in the use of sign language for communication. South African Sign Language (SASL) has gained interest in recent years among researchers around South Africa. It is important to study whether the basic frameworks for expression recognition are able to handle the challenges of recognizing SASL expressions.