10 Bits: the Data News Hotlist
This week’s list of data news highlights covers March 18-26, 2017, and includes articles about a new method for transmitting data more efficiently across undersea cables and a sensor designed to ensure fruit stays fresh in transit.
Researchers at the University of Rochester have developed an AI system that was able to learn racist code words in social media posts that other hate speech monitoring systems would likely dismiss as innocuous. Racist social media accounts have taken to using the names of brands as slurs, such as by replacing “Jew” with “Skype” or “Muslim” with “Skittle,” to evade automated moderating tools. The researchers trained their system on 250,000 tweets likely to contain hate speech. The system learned to identify words that are commonly associated with hate speech but are not technically hate speech themselves, and to predict whether a tweet using these words was masking hate speech.
Researchers from Sookmyung Women’s University and Yonsei University in Seoul have developed a machine learning system that can help autonomous vehicles quickly analyze street signs regardless of weather or light conditions. Autonomous vehicles typically analyze street signs by using a camera to identify their shapes, colors, or other physical features. The researchers’ method uses these techniques in addition to analyzing the relative reflectiveness of different features of a sign. This approach allows an autonomous vehicle’s computer to decipher street signs faster than traditional methods but without a significant increase in processing power.
Facebook has successfully tested a method for transmitting data across an undersea cable that carried 250 percent more data over the same amount of bandwidth as traditional methods. Facebook used a new transmission technique called probabilistic constellation shaping, which modulates signals sent over fiber optic cables in a way that optimizes signal quality without the need for increased transmission energy. The test ran across the America Europe Connect undersea cable, which spans 3,400 miles from New York to Ireland. This approach could help data-intensive services operating around the world make substantially more efficient use of existing Internet infrastructure.
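The article does not describe how Facebook implemented the technique. As a rough illustration of the general idea behind probabilistic shaping, the Python sketch below assumes an 8-level amplitude constellation and a Maxwell-Boltzmann distribution over its levels, a common textbook choice. The function names and the shaping parameter value are hypothetical and chosen only for this example.

```python
import math

def pam_levels(m):
    """Amplitude levels of an m-level constellation: ..., -3, -1, +1, +3, ..."""
    return [2 * i - (m - 1) for i in range(m)]

def maxwell_boltzmann(levels, lam):
    """Probability of each level, proportional to exp(-lam * level**2).

    lam = 0 gives the uniform distribution (no shaping); larger lam
    favors low-energy levels more strongly. (Assumed shaping rule.)
    """
    weights = [math.exp(-lam * x * x) for x in levels]
    total = sum(weights)
    return [w / total for w in weights]

def average_energy(levels, probs):
    """Mean symbol energy under the given level probabilities."""
    return sum(p * x * x for x, p in zip(levels, probs))

def entropy_bits(probs):
    """Information carried per symbol, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

levels = pam_levels(8)
uniform = maxwell_boltzmann(levels, 0.0)   # conventional, equally likely levels
shaped = maxwell_boltzmann(levels, 0.05)   # hypothetical shaping strength

for name, probs in (("uniform", uniform), ("shaped", shaped)):
    print(name,
          "energy:", round(average_energy(levels, probs), 3),
          "bits/symbol:", round(entropy_bits(probs), 3))
```

Sending low-energy symbols more often lowers the average energy per symbol at the cost of slightly fewer bits per symbol. A shaped system can spend that saved energy on a larger constellation or better noise tolerance, which is the sense in which signal quality improves without more transmission energy.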
YouTube has launched a sound effect captioning feature that uses AI to automatically identify sound effects in videos, which could improve viewing experiences for the deaf and hard of hearing. Developers trained an artificial neural network on thousands of hours of video to teach it to identify different non-speech sounds, including applause, music, and laughter, as well as to identify when these sounds occur simultaneously. Now, YouTube’s closed captioning feature will display tags for these effects in real time alongside its automatic speech transcription.
A team at the Swiss Federal Laboratories for Materials Science and Technology has developed a device modeled after fruit and containing a temperature sensor designed to ride along with fruit shipments and gather data that could help distributors reduce spoilage and improve their logistics. The team used computer modeling and 3D printing to develop devices that mimic the physical properties of the flesh of oranges, apples, bananas, and mangos, so the sensor inside will accurately reflect the temperature of a real fruit. Should a shipment of fruit spoil, a distributor could analyze this data to figure out where in their supply chain the fruit was stored at the wrong temperature.
The Canadian government has announced a new initiative called the Pan-Canadian Artificial Intelligence Strategy designed to make Canada more competitive in AI research and development. The government has pledged $125 million CAD to support the strategy, which will focus on increasing national collaboration around AI, attracting and retaining academic talent, and making Canada a more attractive destination for companies that want to invest in AI.
Seattle startup Headset has developed business intelligence analytics services for the cannabis industry, which is legal in Washington state, by using data that cannabis retailers and medical dispensaries are already legally required to collect and report. The services integrate with point-of-sale systems and aggregate sales data with inventory data, helping businesses understand which products are popular, at what times of day they get more customers, and other factors, so they can make more informed purchasing, staffing, and other business decisions.
NASA is developing a satellite called the Laser Communications Relay Demonstration (LCRD) that can encode data onto a beam of light, which can be used to transmit data 10 to 100 times faster than traditional radio-frequency-based space communications. This technology will allow NASA to increase the data-transmission rates between spacecraft to speeds up to a gigabit per second. NASA will launch LCRD in summer 2019.
Researchers at Harvard University have developed a diagnostic tool that relies on a user’s smartphone to measure male fertility. A smartphone attachment converts a phone’s camera into a microscope, and a corresponding app analyzes video taken by the microscope of a semen sample to calculate sperm count and motility, which indicate fertility. The tool, which costs just $5 to make, was able to identify abnormal samples, which could indicate infertility, with 98 percent accuracy.
Ministers from France, Germany, Italy, Luxembourg, the Netherlands, Portugal, and Spain have launched a joint initiative called EuroHPC to build up the European Union’s high-performance computing (HPC) capacity. EuroHPC participants have committed to acquiring at least two exascale computers, HPC systems capable of a billion billion calculations per second, by 2023, and making them available as a resource for researchers, industry groups, and the public sector throughout the EU.
Image: Petr Kratochjvil.