Toward a learned project-specific fault taxonomy: application of software analytics

This position paper argues that fault classification provides vital information for software analytics, and that machine learning techniques such as clustering can be applied to learn a project- (or organization-) specific fault taxonomy.

Anecdotal evidence for this position is presented, along with possible areas of research for moving toward the posited goal.
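
As one concrete illustration of the envisioned analytics, the minimal sketch below clusters bug-report text with TF-IDF and k-means using scikit-learn, and uses the top terms of each cluster as a provisional category label. The example reports, the number of clusters, and the labeling heuristic are assumptions made purely for illustration, not a pipeline proposed by the paper.

    # Illustrative sketch only; data, cluster count, and labeling are assumptions.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    reports = [
        "null pointer exception when saving user profile",
        "crash on startup due to missing config file",
        "UI freezes while loading large project",
        "memory leak after repeated file imports",
        "login button unresponsive on mobile layout",
        "config parser fails on unicode paths",
    ]

    vectorizer = TfidfVectorizer(stop_words="english")
    X = vectorizer.fit_transform(reports)

    k = 3  # assumed number of fault categories; in practice chosen per project
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)

    # Label each learned category with its most characteristic terms.
    terms = vectorizer.get_feature_names_out()
    for c in range(k):
        top = km.cluster_centers_[c].argsort()[::-1][:3]
        print(f"cluster {c}: {[terms[i] for i in top]}")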

Test case analytics: Mining test case traces to improve risk-driven testing

In risk-driven testing, test cases are generated and/or prioritized based on different risk measures. For example, the most basic risk measure would analyze the history of the software and assign higher risk to test cases that detected bugs in the past. In practice, however, a test case may not be exactly the same as a previously failed test, but only quite similar to it. In this study, we define a new risk measure that assigns a risk factor to a test case if it is similar to a failing test case from history. The similarity is defined over the execution traces of the test cases, where each test case is represented as a sequence of method calls.
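
To make the idea concrete, the sketch below represents each test case as its method-call trace and assigns as its risk the highest similarity to any previously failing trace. The use of difflib's sequence ratio as the similarity function and the example traces are assumptions for illustration; the paper's exact similarity measure may differ.

    # Illustrative sketch; the similarity function and traces are assumptions.
    from difflib import SequenceMatcher

    def trace_similarity(a, b):
        """Similarity in [0, 1] between two method-call sequences."""
        return SequenceMatcher(None, a, b).ratio()

    def risk(trace, failing_traces):
        """Risk of a test case = max similarity to any previously failing trace."""
        return max((trace_similarity(trace, f) for f in failing_traces), default=0.0)

    failing_history = [
        ["init", "loadConfig", "parse", "validate", "save"],
        ["init", "connectDb", "query", "close"],
    ]
    new_trace = ["init", "loadConfig", "parse", "render"]  # similar, but not identical

    print(risk(new_trace, failing_history))  # ~0.67 -> prioritize this test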

We have evaluated our new risk measure by comparing it to a traditional risk measure, in which the risk is increased only if the very same test case, not a similar one, failed in the past. The results of our study, in the context of test case prioritization on two open source projects, show that our new risk measure is far more effective at identifying failing test cases than the traditional risk measure.

How we resolve conflict: an empirical study of method-level conflict resolution

Context: Branching and merging are common activities in large-scale software development projects. Isolated development with branching enables developers to focus their effort on their specific tasks without wasting time on problems caused by other developers’ changes. After tasks in branches are completed, those branches should be integrated into common branches by merging. When conflicts occur during merging, developers need to resolve them, which is troublesome. Goal: To support conflict resolution in merging, we aim to understand how conflicts are resolved in practice through a large-scale study. Method: We present techniques for identifying conflicts and detecting conflict resolution at the method level.

Result: From the analysis of 10 OSS projects written in Java, we found that (1) 44% (339/779) of conflicts are caused by concurrent changes to the same positions in methods, 48% (375/779) by deleting methods, and 8% (65/779) by renaming methods, and that (2) 99% (771/779) of conflicts are resolved by directly adopting one of the conflicting methods. Conclusions: Our results suggest that most conflicts are resolved in a simple way. One direction for future work is to develop methods that support conflict resolution.
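
The finding that 99% of conflicts are resolved by directly adopting one method suggests a simple detection check, sketched below under our own assumptions: compare the merged body of a conflicting method with the two branch versions after whitespace normalization. The normalization and the labels are illustrative, not the paper's actual detection technique.

    # Illustrative sketch; normalization and labels are assumptions.
    def normalize(body: str) -> str:
        return " ".join(body.split())

    def classify_resolution(ours: str, theirs: str, merged: str) -> str:
        m = normalize(merged)
        if m == normalize(ours):
            return "adopted ours"
        if m == normalize(theirs):
            return "adopted theirs"
        return "manually edited"

    ours = "int add(int a, int b) { return a + b; }"
    theirs = "int add(int a, int b) { return b + a; }"
    merged = "int add(int a, int b) {  return b + a; }"

    print(classify_resolution(ours, theirs, merged))  # adopted theirs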

MARFCAT: Fast code analysis for defects and vulnerabilities

We present a fast machine-learning approach to static code analysis and fingerprinting of weaknesses related to security, software engineering, and other areas, using the open-source MARF framework and its MARFCAT application. We used the data sets from NIST’s SATE IV Static Analysis Tool Exposition workshop, which included popular open-source projects and large synthetic sets as test cases.

For detecting weak or vulnerable code, whether source or binary, across different platforms, the machine learning approach proved fast and accurate for tasks where other tools are either much slower or have much lower recall of known vulnerabilities. We use signal processing techniques in our approach to accomplish the classification tasks. MARFCAT’s design is independent of the language being analyzed, whether source code, bytecode, or binary.
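
As a rough illustration of signal-based code classification in general (not MARF's or MARFCAT's actual pipeline), the sketch below treats a file's bytes as a one-dimensional signal, uses a fixed-length magnitude spectrum as a feature vector, and classifies by nearest centroid. The feature length, padding, distance metric, and toy training data are all assumptions.

    # Illustrative sketch only; not the MARF/MARFCAT pipeline.
    import numpy as np

    def spectrum_features(data: bytes, n: int = 64) -> np.ndarray:
        signal = np.frombuffer(data, dtype=np.uint8).astype(float)
        padded = np.zeros(1024)
        padded[:min(len(signal), 1024)] = signal[:1024]   # zero-pad or trim
        spec = np.abs(np.fft.rfft(padded))[:n]            # low-frequency magnitudes
        return spec / (np.linalg.norm(spec) + 1e-9)

    # Tiny toy "training set": one centroid per class from labeled snippets.
    classes = {
        "suspect": [b"strcpy(buf, input); system(cmd);"],
        "clean":   [b"strncpy(buf, input, sizeof(buf) - 1);"],
    }
    centroids = {c: np.mean([spectrum_features(s) for s in snippets], axis=0)
                 for c, snippets in classes.items()}

    def classify(data: bytes) -> str:
        f = spectrum_features(data)
        return min(centroids, key=lambda c: np.linalg.norm(f - centroids[c]))

    print(classify(b"strcpy(dst, argv[1]);"))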

Testing analytics on software variability

Software testing is a tool-driven process. However, there are many situations in which different hardware/software components are tightly integrated, so system integration testing has to be performed manually to evaluate the system’s compliance with its specified requirements and performance. There can be many combinations of changes, as different versions of hardware and software components may be upgraded and/or substituted; occasionally, some software components may even be replaced by clones. After each component change, the whole system needs to be re-tested to ensure proper system behavior.

For better utilization of resources, there is a need to prioritize the past test cases used to test the newly integrated systems. We propose a way to leverage the historical testing records of previous systems so that a test case portfolio can be developed, one that aims to make the most of testing resources across the same integrated product family. Because the proposed framework does not depend heavily on internal software complexity, its implementation costs are relatively low.
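
A minimal sketch of how such a portfolio might be built from historical records is shown below: past test cases are ranked by their historical failure rate and by their relevance to the components changed in the new configuration. The record format, the scoring formula, and the equal weighting are assumptions for illustration only.

    # Illustrative sketch; record format and weights are assumptions.
    from dataclasses import dataclass

    @dataclass
    class TestRecord:
        name: str
        runs: int
        failures: int
        components: set

    def score(t: TestRecord, changed: set) -> float:
        failure_rate = t.failures / t.runs if t.runs else 0.0
        relevance = len(t.components & changed) / max(len(changed), 1)
        return 0.5 * failure_rate + 0.5 * relevance   # assumed equal weighting

    history = [
        TestRecord("boot_sequence", runs=40, failures=6, components={"firmware", "power"}),
        TestRecord("video_stream",  runs=25, failures=1, components={"camera", "codec"}),
        TestRecord("ota_update",    runs=30, failures=9, components={"firmware", "network"}),
    ]
    changed_components = {"firmware"}   # e.g. a new firmware version was integrated

    portfolio = sorted(history, key=lambda t: score(t, changed_components), reverse=True)
    print([t.name for t in portfolio])  # highest-priority tests first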

Social Skills? There’s an app for that

No, really, there is. The iPad app store features about a dozen applications designed to help children on the autism spectrum develop everyday social skills such as recognizing facial expressions, reading body language, and initiating conversations. Because these applications are relatively new, there’s more hype than evidence regarding their effectiveness, but parents and autism researchers are understandably hopeful that this new technology will enable these kids to navigate more easily through our socially demanding world.

But, like a drug that has the opposite effect when ingested by the wrong person, addictive reliance on technology is causing a debilitating decline in social skills in the vast majority of our younger population.

Subjective Evaluation of a Semi-Automatic Optical See-Through Head-Mounted Display Calibration Technique

With the growing availability of optical see-through (OST) head-mounted displays (HMDs), there is a present need for robust, uncomplicated, and automatic calibration methods suited to non-expert users. This work presents the results of a user study that both objectively and subjectively examines the registration accuracy produced by three OST HMD calibration methods: (1) SPAAM, (2) Degraded SPAAM, and (3) Recycled INDICA, a recently developed semi-automatic calibration method. The accuracy metrics used for evaluation include subject-provided quality values and the error between perceived and absolute registration coordinates. Our results show that all three calibration methods produce very accurate registration in the horizontal direction but cause subjects to perceive the distance of virtual objects as closer than intended. Surprisingly, the semi-automatic calibration method produced more accurate registration vertically and in perceived object distance overall.

User-assessed quality values were also highest for Recycled INDICA, particularly when objects were shown at a distance. The results of this study confirm that Recycled INDICA is capable of producing on-screen registration equal or superior to that of common OST HMD calibration methods. We also identify a potential hazard in using reprojection error as a quantitative analysis technique for predicting registration accuracy. We conclude by discussing the need for further examination of INDICA calibration in binocular HMD systems and the present possibility of creating a closed-loop, continuous calibration method for OST augmented reality.

CORONET: Testbeds, demonstration, and lessons learned [invited]

The DARPA Core Optical Networks (CORONET) program envisions a highly dynamic network environment requiring fast provisioning and restoration for a wide variety of bandwidth-on-demand services in IP-over-optical networks. This paper builds on previously reported work on the development of fast provisioning and restoration protocols to meet CORONET requirements. Two separate CORONET testbed implementations are described: an emulation testbed and a cloud application testbed. The emulation testbed is a software implementation of a 100-node, global scale control plane network designed to investigate and validate the performance of CORONET provisioning and restoration protocols for the transport layer.

The cloud application testbed is implemented on a four-node network in AT&T’s Software Defined Network (SDN) Wide Area Network (SWAN) testbed and is designed to show the utility of CORONET fast provisioning for cloud computing applications. In particular, we demonstrate dynamic provisioning of inter-datacenter capacity for virtual machine load balancing under the control of an SDN orchestrator. “Lessons learned” and further challenges are summarized.

There is a Will, There is a Way: A New Mechanism for Traffic Control Based on VTL and VANET

Traffic lights are regarded as one of the most effective ways to alleviate traffic congestion and carbon emission problems. However, traditional traffic lights cannot meet the challenges in traffic regulation posed by the fast-growing number of vehicles and the increasing complexity of road conditions. In this paper, we propose a dynamic traffic regulation method based on virtual traffic lights (VTL) for vehicular ad hoc networks (VANETs).

In our framework, each vehicle can express its “will” (its desire to move forward) and share its “will” value and related traffic information with other vehicles at an intersection controlled by a virtual traffic light. Based on the traffic information collected in real time, the virtual traffic light in our scheme can adapt to the changing environment. We conducted a number of simulation experiments with different scenarios using the NS3 network simulator combined with the SUMO traffic simulator. The results demonstrate the viability of our solution in reducing waiting time and improving traffic efficiency.
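
A toy sketch of the “will”-based decision, under assumed semantics, is shown below: each vehicle reports a will value derived from its waiting time and queue position, and the virtual traffic light grants the green phase to the approach with the largest aggregate will. The will formula and the aggregation rule are illustrative assumptions, not the paper's exact scheme.

    # Illustrative sketch; the will formula and aggregation are assumptions.
    def will(waiting_time_s: float, queue_position: int) -> float:
        # Longer waits and positions closer to the intersection raise the will.
        return waiting_time_s + 10.0 / (1 + queue_position)

    def choose_green(approaches: dict) -> str:
        """approaches: direction -> list of (waiting_time_s, queue_position)."""
        aggregate = {d: sum(will(w, q) for w, q in vehicles)
                     for d, vehicles in approaches.items()}
        return max(aggregate, key=aggregate.get)

    intersection = {
        "north-south": [(35.0, 0), (20.0, 1), (5.0, 2)],
        "east-west":   [(12.0, 0), (8.0, 1)],
    }
    print(choose_green(intersection))  # "north-south" gets the virtual green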

Adaptive Lookup Protocol for Two-Tier VANET/P2P Information Retrieval Services

Intelligent transportation system (ITS) services have attracted significant attention in recent years. To support ITS services, an architecture is required that can retrieve information and data from moving vehicles and roadside facilities in an efficient manner. A two-tier system that integrates low-tier vehicular ad hoc networks (VANETs) with a high-tier infrastructure-based peer-to-peer (P2P) overlay, which can achieve a high lookup success rate and low lookup latency for information retrieval, has been developed. However, conventional information lookups in the two-tier VANET/P2P system may introduce extra lookup messages and latency because lookup queries are performed simultaneously over both the VANET and P2P networks.

This paper proposes an adaptive lookup protocol for the two-tier VANET/P2P system to improve the efficiency of information retrieval. The proposed protocol uses a Bloom filter, a space-efficient probabilistic data structure, to collect reachability information for road segments, enabling adaptive routing of queries between the low- and high-tier networks according to reachability probability. Simulations based on the SUMO traffic simulator and the QualNet network simulator demonstrate that, compared with the conventional two-tier lookup mechanism, the adaptive lookup protocol reduces lookup latency by 12%, reduces P2P lookup overhead by 20%-33%, and achieves a high success rate in information lookups.
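
A minimal Bloom-filter sketch of the reachability idea follows: road segments currently reachable over the VANET are inserted into the filter, and a query is routed over the VANET when the filter reports the target segment as (probably) present, falling back to the P2P overlay otherwise. The filter size, hash count, and routing rule are assumptions for illustration.

    # Illustrative sketch; sizes, hashes, and the routing rule are assumptions.
    import hashlib

    class BloomFilter:
        def __init__(self, m_bits: int = 1024, k_hashes: int = 3):
            self.m, self.k = m_bits, k_hashes
            self.bits = bytearray(m_bits // 8)

        def _positions(self, item: str):
            for i in range(self.k):
                h = hashlib.sha256(f"{i}:{item}".encode()).digest()
                yield int.from_bytes(h[:4], "big") % self.m

        def add(self, item: str):
            for p in self._positions(item):
                self.bits[p // 8] |= 1 << (p % 8)

        def might_contain(self, item: str) -> bool:
            return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

    reachable = BloomFilter()
    for segment in ["seg-101", "seg-102", "seg-205"]:
        reachable.add(segment)

    target = "seg-102"
    route = "VANET" if reachable.might_contain(target) else "P2P overlay"
    print(f"route lookup for {target} via {route}")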