10 Bits: the Data News Hotlist
This week’s list of data news highlights covers December 10-16, 2016 and includes articles about a machine learning system that can translate brainwaves and a project in Chicago to evaluate green infrastructure with the Internet of Things.
Microsoft has released its new Microsoft Translator app that uses machine learning to translate spoken conversation to nine different languages simultaneously, allowing groups of up to 100 people speaking different languages to talk to each other. When someone asks a question in English, for example, the app will translate the question into the languages of the conversation participants, and translate their responses back into English for the initial speaker. Microsoft has already used the app to help the charity Children’s Society in London work with impoverished refugee and immigrant children, as well as to help non-English speakers apply for identification cards in New York.
The U.S. Department of Transportation (DOT) has published its long-awaited proposed rule to mandate the installation of vehicle-to-vehicle (V2V) communication technology in new cars. Implementing the technology would allow connected vehicles to share data with each other while on the road to make driving safer, such as warnings when an upcoming car is braking or if an oncoming vehicle is running a red light. After DOT finalizes the rule, which could take up to a year, automakers will have two years to install the technology in half of all of their new vehicles, and then another two years to install the technology in all of them.
Researchers at the Helsinki Institute for Information Technology have developed a machine learning technique that analyzes data about a person’s brainwaves as they read to detect when they have come across a concept that interests them. The researchers’ system analyzes data from electroencephalogram (EEG) sensors as subjects read Wikipedia articles and can identify keywords in the articles that cause a specific change in brain electrical activity, indicating a subject’s interest in a certain topic. Unlike other methods for gauging reader interest in a topic, the researchers’ technique does not involve measuring any physical characteristics, such as eye movement.
Financial technology company Cobalt DL is developing a system that uses distributed ledger technology, more commonly known as blockchain, to consolidate data about a currency market’s activity into a single repository to help regulators and financial firms better understand sudden crashes. Financial markets already compile this data, but store it in multiple systems that multiple authorities must verify to ensure they are accurate, making it difficult to quickly analyze, for example, why a currency is suddenly crashing. By consolidating all of this data into a single, authenticated repository, firms and regulators can reduce costs and better respond to flash crashes.
City Digital, a Chicago-based government, private sector, and academic partnership, has launched a pilot project to use connected sensors to evaluate the effectiveness of new green infrastructure projects in Chicago. These projects include permeable pavement installations and bioswales, which are areas designed to filter silt and pollution from runoff water. Once the city finishes construction in spring 2017, the pilot will analyze data from sensors measuring soil moisture, humidity, chemical absorption, and other factors at six of these sites. With this data, the city can determine how cost-effective different types of green infrastructure are and predict the viability of new infrastructure installations.
The European Union Intellectual Property Office (EUIPO) has partnered with startup TrademarkVision to use the company’s machine learning image recognition platform to power an image search for its image trademark system. The platform allows users to enter an image of a logo and compare it to any similar trademarked logos in EUIPO’s database, similar to how search engines such as Google offer reverse image search. Traditionally, tracking down similar logos is challenging because many governments use systems of keywords and codes to annotate logos, but these systems are not uniform and can be difficult to understand.
Several legal research services, such as Bloomberg Law and Lex Machina, have begun offering analytics services that use historical data to predict how specific judges are likely to decide on upcoming cases. For example, Ravel Law is digitizing Harvard Law School’s library of case law and using machine learning to identify connections between cases and the language judges use to determine what kind of factors are likely to sway a certain judge, such as how a certain judge will be less receptive to a lawyer’s argument if it involves a sports analogy. With these services, law firms can identify and avoid pursuing frivolous lawsuits likely to be dismissed, or better inform their clients about the likely outcome of their cases.
Amazon has launched a trial of its autonomous flying drone delivery system in the United Kingdom and has made its first customer delivery. The drones, which navigate by GPS and sensors that can detect potential collisions, will be able to deliver packages weighing 2.3 kilograms or less within a 30-minute radius. Amazon will continuously gather performance data on the drones throughout the pilot to identify opportunities for improvement and to ensure safety.
The U.S. National Geospatial-Intelligence Agency (NGA) and the General Services Administration (GSA) have launched the Commercial Initiative to Buy Operationally Responsive GEOINT (CIBORG) program to make it easier for NGA to purchase unclassified satellite data from the private sector to supplement its own imagery. Previously, NGA was required to gather satellite data using its own satellite resources, which can be limited and inflexible. Through the CIBORG program, NGA could use GSA contracting guidelines to get this data from a commercial provider instead.
The U.S. Department of Defense (DOD) has launched a new open data website called data.mil designed to encourage third-party use of its data and promote the publication of open data throughout the agency. DOD built data.mil with the help of startups LiveStories and data.world, which focus on building communities around open data, to ensure that data published on the site can be used to tell interesting stories and to encourage collaborative use of its datasets.
Image: Chris Hope.