Software and Tools

Below are some of the software and tools developed within the context of LASIGE projects.

Adaptare, a framework for automatic and dependable adaptation in dynamic environments.

AgreementMakerLight is an automated and efficient ontology matching system derived from AgreementMaker that has been in development since the beginning of 2013. It is open source and freely available through GitHub both as an Eclipse project and as an executable jar.

Like its namesake, AgreementMakerLight has a flexible and extensible framework, but places more emphasis on efficiency so as to tackle large ontology matching problems. It is primarily based on the use of element-level matching techniques supported by background knowledge.

AguiaJ is a pedagogical environment for experimenting with object-oriented programming in Java. The tool enables users to interactively test object classes by creating and manipulating objects and visualizing the result of such interaction in terms of object state. The notion of encapsulation is metaphorically embodied in the environment, by restricting visibility/access to class/package members. Another innovative aspect of the tool is that it can be extended with model-view plugins that enable users to visualize certain objects with custom views, enriching the user experience and enabling introductory programming exercises to deal only with objects as conceptual descriptions of artifacts that are visualized in the tool.

Appia is an open source layered communication toolkit implemented in Java that provides extended configuration and programming possibilities. The Appia toolkit is composed of (1) a core that is used to compose protocols and (2) a set of protocols that provide group communication, ordering guarantees and atomic broadcast, among other properties.

aspa: a patching tool for JVM class files. aspa derives and applies patches between Java classes compiled to the Java Virtual Machine (JVM) bytecode format, resorting to an abstract syntax tree representation of JVM bytecode.

B3PP (Blood Brain Barrier Penetration Prediction) is a web application that interfaces a machine learning model designed to predict the Blood-Brain Barrier penetration properties for any organic molecule.

Compounds can be entered with their common name (in English), a valid SMILES string, or an InChI identifier. The output consists of the model result, some other common names for the submitted molecule, when available, and a graphical representation of the molecule generated by the Chemical Identifier Resolver. Also provided, for completeness, are the a priori probabilities of crossing the Blood-Brain Barrier according to the molecular weight.

This model is valid only for molecules with a molecular weight lower than 600 Da.

BFT-SMaRt is a high-performance Byzantine fault-tolerant state machine replication library developed in Java with simplicity and robustness as primary requirements. Our main objective is to provide a code base that can be used to build dependable services and also extended to create new protocols.

Bica is an extension of the Java language that enables the verification of Java programs against a session type specification.
This specification represents the changes in the interface of an object.
In Java, the interface of an object is the set of methods declared in the object's class, with visibility taken into account.
Session type specifications provide a more flexible way to specify changes in the interface of an object.
The session type is included in the source code for a class as a Java annotation.
Types are extended with session type information, and Bica verifies that clients of a class use it as specified by the session type.
Bica is implemented using the Polyglot framework, as an extension of the jl5 language extension. That is, it supports Java 5 source code.

CESSM (Collaborative Evaluation of GO-based Semantic Similarity Measures) is an online tool for the automated evaluation of GO-based semantic similarity measures, that enables the comparison of new measures against previously published ones in terms of performance against sequence, Pfam and EC similarity.

CIDS (Citation Impact Discerning Self-citations) is a user-friendly web tool that calculates different citation statistics, such as the h-index and g-index.
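As an illustration of the two statistics named above, here is a minimal Python sketch of their standard definitions. It is not CIDS's own code; the function names and the input format (a plain list of per-paper citation counts) are assumptions made for this example. To discount self-citations, the same functions could presumably be applied to counts from which self-citations have already been subtracted.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the g most cited papers total at least g*g citations.

    This variant caps g at the number of papers, which is one common convention.
    """
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        total += count
        if total >= rank * rank:
            g = rank
    return g


if __name__ == "__main__":
    counts = [10, 8, 5, 4, 3]  # hypothetical citation counts, one per paper
    print(h_index(counts), g_index(counts))
```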

CMPSim is a web tool that implements a novel approach to measure the similarity between chemical compounds and metabolic pathways using semantic similarity.
Information about chemical compounds is gathered from the Chemical Entities of Biological Interest (ChEBI) ontology, and information about metabolic pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database.
Using the ChEBI ontology and the chemical annotations of KEGG pathways, CMPSim is capable of measuring the similarity between chemical compounds and metabolic pathways.

ConGu is a tool that supports the checking of Java classes against property-driven algebraic specifications. Checking classes consists in determining, at run-time, whether the classes that are subject to analysis behave as required by the specification. The first version of the tool reduces the problem to the run-time monitoring of contract-annotated classes, which is supported today by several runtime assertion-checking tools. In ConGu 2.0, which is also applicable to generic datatypes implemented by generic Java classes, the run-time monitoring of the implementations is achieved through bytecode instrumentation and the use of assertions. ConGu 2.0 is available in the QUEST plug-in for the Eclipse IDE.

Cooperari is a tool for cooperative testing of multithreaded Java applications.

DepSky is a system that improves the availability, confidentiality and integrity of data stored in the cloud. It reaches this goal by encrypting, encoding and replicating all the data on a set of different clouds, forming a cloud-of-clouds. The current implementation of the system considers a cloud-of-clouds formed by four clouds.

DepSpace (Dependable Tuple Space) is a fault- and intrusion-tolerant secure tuple space implementation. The main objective of the system is to provide an extended tuple space abstraction that can be used to implement Byzantine fault-tolerant applications.

The DNA privacy detector is a method that systematically detects privacy-sensitive DNA segments coming directly from an input stream, using as reference a knowledge database of known privacy-sensitive nucleic and amino acid sequences.

The Epidemiology Ontology aims at increasing the amount of epidemiological data available, improving disease surveillance systems, and promoting collaboration among epidemiological researchers. It currently covers two areas: the transmission of infection, and epidemiological parameters and measures.

GIN, Genome Inspector, was devised to design primers with as broad a range as possible within a microbial species or genus, for PCR gene-screening projects. It uses BLAST to find DNA sequences highly similar to the gene or sequence chosen by the user in completely sequenced microbial genomes (GenBank format only). When an identical match is found, it draws a circular map showing the position of the highest-scoring match, records its position in a table and shows the BLAST alignments. The sequences found by BLAST are then used to perform a multiple sequence comparison by log-expectation (MUSCLE). From the percentage of genomes included in the multiple alignment, the user can easily evaluate whether the conserved sequences found will yield broad-range primers.

GRYFUN (GRaph analYzer of FUNctional annotation) allows the visualization, filtering and subsequent analysis of Gene Ontology (GO) annotation profiles. A GO annotation functional profile consists of the collection of GO terms that annotate a given set of proteins (e.g. a protein family). GO comprises three orthogonal ontology aspects: biological process, molecular function and cellular component. Each of these three GO aspects is structured as a Directed Acyclic Graph (DAG).

GRYFUN's central visualization mode revolves around generating and displaying sub-graphs that subsume all the GO annotations for a given protein set. The represented DAGs comprise nodes and edges: each node represents a GO term, and the connecting directed edges represent taxonomic relationships between terms. In the original GO DAGs these would be is_a relationships starting from children and pointing towards parents, all converging on a common root term. In GRYFUN, however, the edge direction is reversed. In addition, the edge thickness is proportional to the number of proteins annotated to a given non-leaf node that are also annotated to one of its child terms. These two features give a visual cue on both the distribution of terms and the overall specificity of GO annotations within a given protein set. Relevant GO-based annotation and statistical metrics are also displayed. Moreover, the web tool enables the user to interactively create further subsets of proteins associated with terms by clicking their respective nodes on the DAG.

hsSim is an Extensible Interoperable Object-Oriented n-Level Hierarchical Scheduling Simulator.

JDNA is a referential compressor for aligned DNA files.

KeyMgr (Key Manager) is a software package that facilitates the management of individual logins for system administrators of Linux servers. In contrast with traditional approaches, keymgr implements a solution that is completely independent of remote authentication services. Each admin user owns a unique SSH key pair that is used to log in to each remote server. The head of the system administration team manages a file that maps each user to the servers that user is allowed to log in to. keymgr puts the keys together and deploys them on the servers.
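The step of putting the keys together can be pictured with a short Python sketch. This is not keymgr's code or file format: the dictionaries below and the idea that the result is written to an OpenSSH authorized_keys file on each server are assumptions made for illustration only.

```python
def authorized_keys_per_server(access, public_keys):
    """Group public keys by server.

    access:      hypothetical mapping of user name -> list of server names
    public_keys: hypothetical mapping of user name -> public key line
    Returns a mapping of server name -> text for that server's key file.
    """
    per_server = {}
    for user, servers in access.items():
        for server in servers:
            per_server.setdefault(server, []).append(public_keys[user])
    # Sort for a stable file content regardless of dictionary ordering.
    return {
        server: "\n".join(sorted(keys)) + "\n"
        for server, keys in per_server.items()
    }


if __name__ == "__main__":
    access = {"alice": ["web1", "db1"], "bob": ["web1"]}
    public_keys = {
        "alice": "ssh-ed25519 AAAA...alice alice@example",
        "bob": "ssh-ed25519 AAAA...bob bob@example",
    }
    for server, content in authorized_keys_per_server(access, public_keys).items():
        print(server)
        print(content)
```

Deployment itself (copying each file to its server) is left out, since the text does not say how keymgr performs it.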

Lusica is a project for the automatic collection of music quotes of Lusophone artists from social networks.

The initial aim is to produce a history of the popularity of typical Lusophone music styles (e.g. fado, samba) on social networks. In this way, the project aims to promote the dissemination of styles, artists and music throughout the community.

MIL (Multithreaded Intermediate Language) is an assembly language targeted at an abstract multi-processor equipped with a shared main memory. Each processor consists of a series of registers and of a local memory for instructions and for local data. The main memory is divided into a heap and a run pool. The heap stores data and code blocks. Data blocks are represented by tuples and may be local to the processor for its exclusive use, or stored in the heap and shared amongst processors. A code block declares the registers it expects (including the type for each register), the required locks, and an instruction set. The run pool contains suspended threads waiting for a free processor.

Missinks (Missinks: finding the missing links) is a web application that, given a search query, identifies the links in the first two pages of results of a given Google country search engine (e.g. google.ca - Canada) that are not present in the first four pages of results of another Google country search engine (e.g. google.pt - Portugal).

The missing links may exist due to regional differences, or may just be a result of the data protection law in Europe being implemented by Google. We advise you to manually curate the results before using them for any purpose, since the tool does not accept any responsibility or liability for the accuracy of the results.
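The comparison Missinks performs amounts to an ordered set difference between two result lists. A minimal Python sketch, assuming the two lists of URLs have already been fetched (the function name and the sample URLs are hypothetical, and no Google API is called here):

```python
def missing_links(reference_results, other_results):
    """Return links in reference_results that do not appear in other_results.

    reference_results: URLs from the first two result pages of one country engine
    other_results:     URLs from the first four result pages of another engine
    The original ranking order is preserved and duplicates are dropped.
    """
    seen_elsewhere = set(other_results)
    missing = []
    for url in reference_results:
        if url not in seen_elsewhere and url not in missing:
            missing.append(url)
    return missing


if __name__ == "__main__":
    canada = ["https://a.example", "https://b.example", "https://c.example"]
    portugal = ["https://b.example", "https://d.example"]
    print(missing_links(canada, portugal))
```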

mobIPLity produces BonnMotion traces from the eduroam records collected by the Instituto Politécnico de Lisboa (IPL). Its goal is to provide the mobile computing research community with realistic mobility scenarios.

Mool is a mini object-oriented language in a Java-like style with support for concurrency, that allows programmers to specify class usage protocols as types. The specification formalizes (1) the available methods, (2) the tests clients must perform on the values returned by methods, and (3) the existence of aliasing and concurrency restrictions, all of which depend on the object’s state.
Linear and shared objects are handled as a single category, allowing linear objects to evolve into shared ones. The Mool type system verifies that methods are called only when available, and by a single client if so specified in the usage type. Mool builds upon previous works on (Modular) Session Types and Linear Type Systems.

NAMS (Non-contiguous Atom Matching Structural Similarity) is a free web tool that calculates the similarity between molecules based on the structural/topological relationships of each atom to all the others within a molecule. It allows the similarity between two molecules to be calculated from their names, SMILES or InChI, with several parameters that influence the atom/bond matching similarity score. Disconnected fragments are separated so as to keep only the main structure, and molecules with fewer than 3 atoms cannot be processed. Different similarity functions based on fingerprints can also be calculated.

“O Mundo em Pessoa” is a collaboration project with Sapo Labs for the celebration of the 125th anniversary of the birth of Fernando Pessoa. The team developed a web app where it’s possible to analyze the impact of the poet’s work on social networks.

"Onde e quem vai ver" (Where and who's going) is an app that allows the user to browse, create and share the best events.

OpenRQ is a Java library that implements the RaptorQ FEC scheme described in RFC 6330. The aim is to provide developers with a library that is easy to use and to incorporate into their applications, while maintaining RaptorQ’s acclaimed performance and resilience. Forward Error Correction (FEC) is a technique for the recovery of errors in data disseminated over unreliable or noisy communication channels. The central idea is that the sender encodes the message in a redundant way by applying an error-correcting code, which allows the receiver to repair the errors. RaptorQ is a fountain code; fountain codes are a class of erasure codes with two attractive properties: an arbitrary number of encoding symbols can be produced on the fly, simplifying the adaptation to varying loss rates; and the data can be reconstructed with high probability from any subset of the encoding symbols (of size equal to or slightly larger than the number of source symbols).
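To make the idea of an erasure code concrete, the Python sketch below uses a single XOR parity symbol, which is far simpler than RaptorQ and is not a fountain code: it can repair at most one lost symbol. It does not use or imitate the OpenRQ API; all names are hypothetical, and symbols are assumed to have equal length.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length symbols."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(source_symbols):
    """Append one parity symbol: the XOR of all source symbols."""
    parity = reduce(xor_bytes, source_symbols)
    return list(source_symbols) + [parity]


def decode(received):
    """Recover the source symbols when at most one symbol was lost.

    received: the encoded list, with a lost symbol replaced by None.
    """
    missing = [i for i, symbol in enumerate(received) if symbol is None]
    if len(missing) > 1:
        raise ValueError("a single parity symbol can repair only one erasure")
    repaired = list(received)
    if missing:
        present = [symbol for symbol in received if symbol is not None]
        # XOR of all surviving symbols equals the lost one.
        repaired[missing[0]] = reduce(xor_bytes, present)
    return repaired[:-1]  # drop the parity symbol


if __name__ == "__main__":
    sent = encode([b"ab", b"cd", b"ef"])
    sent[1] = None  # simulate a lost packet
    print(decode(sent))
```

A fountain code such as RaptorQ generalizes this: instead of one fixed parity symbol, the sender can keep generating new encoding symbols for as long as needed.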

PAMPA (Power-Aware Message Propagation Algorithm) is a broadcast algorithm for mobile ad hoc networks (MANETs). PAMPA tries to save the resources of participants in MANETs by reducing the number of nodes that are required to retransmit a message so that it gets delivered to every node. The novelty of PAMPA is that the nodes that retransmit are selected based on the signal strength with which the message was delivered, whereas most other algorithms pick some nodes at random.

ParTypes is a toolchain for validating and synthesising message-based programs that use the Message Passing Interface (MPI).
The general aim is to enforce program compliance with dependent-type based protocol specifications, ensuring properties such as protocol fidelity and the absence of deadlocks. The toolchain is composed of an Eclipse plug-in, an annotated MPI library and a C annotator, and makes use of the Verifying C Compiler (VCC), the Z3 SMT solver, and the Why3 platform.
A key result is that verification of C+MPI programs is immune to the state-explosion problem, and verification times are independent of input parameters such as the number of processes, in contrast with established methodologies for MPI program verification, e.g., those employing model checking and/or symbolic execution.
The Eclipse plug-in allows for the writing of protocol specifications, verifies that protocols are well formed (with the help of Z3 for checking dependent type restrictions), and generates protocol representations in VCC and WhyML formats for program verification. The plug-in also synthesises C+MPI programs that are correct by construction and annotated with VCC logic that works as a proof of their correctness.

PESTT (PESTT an Educational Testing Tool) is an Eclipse plug-in for learning and designing unit tests for the Java language. Currently, PESTT supports unit tests based on the control flow graph (CFG) and data flow of methods. It generates the CFG based on the source code of the method, allows for bidirectional linking between the source code and the generated CFG, generates test requirements, determines test paths, and computes the coverage level statistics. It provides full integration with JUnit and reconstructs run paths of each test method, computing statistics either for each test method or for the complete test set. PESTT distinguishes itself from other coverage analysis testing tools because it is not just a node and branch coverage analyzer. Among other things, it supports many other “more powerful” coverage criteria, like prime path coverage or all def-use paths coverage, and assists the test engineer in the various tasks of the testing activity.

ProPi is a tool to statically verify whether message passing programs are free from deadlocks. The tool takes as input a system specified in the pi-calculus, together with typing annotations that describe the communications in the channels, as well as event annotations that capture the overall ordering of the communications. The tool produces either a positive answer (the system type checks), in which case the properties of protocol fidelity (system communications follow the typing prescription) and progress (deadlock absence) hold, or error information that helps identify the problem in the system specification. The tool includes an Eclipse plugin and a standalone command line interface.

ProteInOn is a web tool focused on calculating GO-based protein semantic similarity. It features a stepwise query selection menu, which together with the possibility of selecting results as input for new queries, makes it flexible and customizable. It also incorporates data on protein interactions, allowing for comparative studies between protein similarity and interactions.

The tool can be used to compute semantic similarity between proteins or GO terms, using several different measures: the three "classic" term similarity measures (Resnik's, Lin's and Jiang and Conrath's) with or without the GraSM approach and using Best-Match Average combination of term similarities; as well as the novel simGIC.

It can also be used to find GO terms assigned to one or more proteins or GO terms representative of a set of proteins. A score is used to measure the representativeness of a term for a set of proteins, based on the number of proteins the term is annotated to and its probability of annotation.

Finally, it can be used to find proteins that interact with one or more other proteins, which enables the integration of knowledge on protein interactions with functional and processual knowledge: proteins that share a group of interactors should have a similar molecular function whereas a group of interacting proteins should be involved in similar biological processes.
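The similarity measures named above have widely used textbook definitions based on the information content (IC) of GO terms. The Python sketch below follows those definitions and is not ProteInOn's implementation: the data structures (`ancestors`, `ic`) and function names are hypothetical, the GraSM variant is omitted, and Jiang and Conrath's measure is given in its original distance form.

```python
def common_ancestors(a, b, ancestors):
    """Terms subsuming both a and b (each term counts as its own ancestor)."""
    return (ancestors[a] | {a}) & (ancestors[b] | {b})


def resnik(a, b, ancestors, ic):
    """IC of the most informative common ancestor."""
    return max(ic[t] for t in common_ancestors(a, b, ancestors))


def lin(a, b, ancestors, ic):
    denominator = ic[a] + ic[b]
    if denominator == 0:
        return 0.0
    return 2 * resnik(a, b, ancestors, ic) / denominator


def jiang_conrath_distance(a, b, ancestors, ic):
    return ic[a] + ic[b] - 2 * resnik(a, b, ancestors, ic)


def best_match_average(terms_a, terms_b, term_sim):
    """Average of each term's best match, taken in both directions."""
    best_a = [max(term_sim(a, b) for b in terms_b) for a in terms_a]
    best_b = [max(term_sim(a, b) for a in terms_a) for b in terms_b]
    return (sum(best_a) / len(best_a) + sum(best_b) / len(best_b)) / 2


def sim_gic(terms_a, terms_b, ancestors, ic):
    """IC-weighted Jaccard index over the two ancestor-closed term sets."""
    closure_a = set().union(*[ancestors[t] | {t} for t in terms_a])
    closure_b = set().union(*[ancestors[t] | {t} for t in terms_b])
    union_ic = sum(ic[t] for t in closure_a | closure_b)
    if union_ic == 0:
        return 0.0
    return sum(ic[t] for t in closure_a & closure_b) / union_ic


if __name__ == "__main__":
    # Toy ontology: root -> a -> c, root -> b (IC values are made up).
    ancestors = {"root": set(), "a": {"root"}, "b": {"root"}, "c": {"a", "root"}}
    ic = {"root": 0.0, "a": 1.0, "b": 2.0, "c": 3.0}
    print(resnik("c", "a", ancestors, ic), lin("c", "a", ancestors, ic))
    print(sim_gic({"c"}, {"a"}, ancestors, ic))
```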

This project’s objective is the development of mechanisms to take advantage of the knowledge of the crowd to validate news about football transfers. The project allows users to vote on news, thus helping with its validation, and takes advantage of social networks by sharing the news that has been validated. It is also possible to analyze the feelings of the crowd and to use the knowledge generated by this analysis to validate news about football transfers.

Real-Time Proactive Secret Sharing Library for RTAI is an implementation of the Shamir's secret sharing scheme and Herzberg's proactive secret sharing algorithm. The library uses a port of GNU GMP in order to make multiple precision arithmetic operatio

SASULisboa App is an app available for Android and iOS that lets students check the menus and their balance at the refectories of the University of Lisbon.

SCFS is a cloud-backed file system that provides strong consistency even on top of eventually-consistent cloud storage services. It is built on top of FUSE, thus providing a POSIX-like interface. SCFS also provides a pluggable backend that allows it to work with a single cloud or with a cloud-of-clouds.

SePi is a concurrent, message-passing programming language based on the pi-calculus. The language features synchronous, bi-directional channel-based communication.
Programs use primitives to send and receive messages as well as to offer and select choices. Channel interactions are statically verified against session types describing the kind and order of messages exchanged, as well as the number of processes that may share a channel. In order to facilitate a more precise control of program properties, SePi includes assume and assert primitives at the process level and refinements at the type level. Refinements are treated linearly, allowing a finer, resource-oriented use of predicates: each assumption made supports exactly one assertion.

ThermInfo (Collecting, Retrieving, and Estimating Reliable Thermochemical Data) is a cheminformatics system designed and built with two main objectives in mind: collecting and retrieving critically evaluated thermochemical values, and estimating new data.

In its present version, by using chemically intelligent software, ThermInfo allows users to retrieve the value of a thermochemical property, such as a gas-phase standard enthalpy of formation, by inputting, for example, the molecular structure or the name of a compound. The same inputs can also be used to estimate data (this feature is presently restricted to non-polycyclic hydrocarbons).

Future versions of ThermInfo will cover a wide range of (long-lived and transient) organic, inorganic, and organometallic molecules in the gas and condensed phases. A variety of empirical methods, selected on the basis of their reliability to predict data, will be included. New estimation procedures, based on structure-energetics relationships and machine learning methods, will be explored.

ThermInfo involves a partnership between the Molecular Energetics Group of CQB (Centro de Química e Bioquímica) and LaSIGE (Large-Scale Informatics Systems Laboratory). The chemistry team has considerable expertise on a variety of experimental thermochemical techniques, on assessing thermochemical data, and on the development of prediction methods. The informatics team has extensive experience in web systems development, in particular biomolecular databases.

TryIt: learn and use Gloss tools straight from your web browser.

WAP 2.0 (Web Application Protection) is a source code static analysis and data mining tool that detects and corrects input validation vulnerabilities in web applications written in PHP (version 4.0 or higher) with a low rate of false positives. The tool performs taint analysis (data-flow analysis) to detect the input validation vulnerabilities. The aim of the taint analysis is to track malicious inputs inserted at the entry points of the web application ($_GET, $_POST arrays) and to verify if they reach some sensitive sink (PHP functions that can be exploited by malicious input). After the detection, the tool uses data mining to confirm if the vulnerabilities are real or false positives. At the end, the real vulnerabilities are automatically corrected by inserting fixes (small pieces of code) into the source code.
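The taint-tracking idea can be sketched in a few lines of Python over a toy, straight-line program representation. This is only an illustration under strong assumptions, not WAP's analysis: the statement tuples are a hypothetical format invented here, the sink and sanitizer sets are small illustrative subsets, and the data mining and automatic correction steps are not covered.

```python
ENTRY_POINTS = {"$_GET", "$_POST"}
SENSITIVE_SINKS = {"mysql_query"}  # illustrative subset
SANITIZERS = {"mysql_real_escape_string", "htmlentities"}  # illustrative subset


def find_tainted_sinks(statements):
    """Report sink calls that receive tainted arguments.

    statements is a list of tuples in a hypothetical format:
      ("assign", target, [source variables], via)  via is a function name or None
      ("call", function, [argument variables])
    Returns a list of (statement number, sink, tainted arguments).
    """
    tainted = set(ENTRY_POINTS)
    findings = []
    for number, statement in enumerate(statements, start=1):
        if statement[0] == "assign":
            _, target, sources, via = statement
            flows = any(source in tainted for source in sources)
            if flows and via not in SANITIZERS:
                tainted.add(target)
            else:
                # Overwritten with clean or sanitized data.
                tainted.discard(target)
        elif statement[0] == "call":
            _, function, arguments = statement
            bad = [arg for arg in arguments if arg in tainted]
            if function in SENSITIVE_SINKS and bad:
                findings.append((number, function, bad))
    return findings


if __name__ == "__main__":
    program = [
        ("assign", "$id", ["$_GET"], None),
        ("assign", "$q", ["$id"], None),
        ("call", "mysql_query", ["$q"]),
        ("assign", "$safe", ["$id"], "mysql_real_escape_string"),
        ("assign", "$q2", ["$safe"], None),
        ("call", "mysql_query", ["$q2"]),
    ]
    print(find_tainted_sinks(program))
```

Only the first query is reported, because the second one is built from a value that passed through a sanitizer.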