Improving the Reading Experience with On-Device Content Distillation
Most people now read news and other long-form content on smartphones and in desktop web browsers. However, cluttered, complex web pages can make articles hard to read and navigate, especially for people with accessibility needs. To address this, Android and Chrome users have access to the Reading Mode feature, which improves accessibility by letting readers customize contrast, text size, and font, and by enabling text-to-speech. Android’s Reading Mode can even distill content from apps. But expanding Reading Mode’s capabilities without compromising privacy is a challenge.
To tackle this challenge, Google Research has developed an on-device content distillation model. Unlike previous methods, which were limited to news articles, our model handles a broad range of content types with high quality. Because all processing happens on the user’s device, the article content never leaves it, protecting privacy. The model transforms long-form content into a simplified layout for a better reading experience and outperforms alternative approaches.
Using Graph Neural Networks
Instead of relying on complicated heuristics that are hard to scale, we approached the task as a supervised learning problem. This data-driven approach lets the model generalize across different layouts. Previous methods parsed the raw HTML, but our model operates on accessibility trees, which provide a streamlined representation of the page’s structure. These trees are generated from the DOM tree and are used by assistive technologies to make web content accessible.
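To make the input concrete, here is a minimal sketch of what a node in such a tree might look like. The schema below (role, text, bounding box, children) is a simplification for illustration; the actual accessibility trees produced by Android and Chrome carry many more attributes.

```python
from dataclasses import dataclass, field

@dataclass
class AccessibilityNode:
    """A simplified accessibility-tree node (illustrative schema)."""
    role: str                    # e.g. "heading", "paragraph", "button"
    text: str = ""
    bbox: tuple = (0, 0, 0, 0)   # (x, y, width, height) on screen
    children: list = field(default_factory=list)

def flatten(node, depth=0):
    """Yield (depth, role, text) for every node, depth-first."""
    yield depth, node.role, node.text
    for child in node.children:
        yield from flatten(child, depth + 1)

# A tiny article subtree: some nodes are essential, some are page chrome.
article = AccessibilityNode("article", children=[
    AccessibilityNode("heading", "On-Device Distillation"),
    AccessibilityNode("paragraph", "Main body text..."),
    AccessibilityNode("button", "Share"),   # non-essential
])

for depth, role, text in flatten(article):
    print("  " * depth + f"{role}: {text}")
```

The distillation task then amounts to labeling each node in such a tree as essential or non-essential.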
We manually collected and annotated accessibility trees from Android and Chrome, and built a tool that uses Graph Neural Networks (GNNs) to distill essential content from them. GNNs are well suited to tree-structured data because they learn the relationships between nodes directly from the structure, without manual feature crafting. By feeding the tree into the model, the GNN can discern these relationships and identify non-essential sections more accurately.
The Model Architecture
Our model follows the encode-process-decode paradigm using a message-passing neural network. The tree structure of the article is the input to the model, and lightweight features based on bounding box information, text information, and accessibility roles are computed. The GNN propagates each node’s representation through the edges of the tree to share information and enhance the model’s understanding of the page’s structure. After a fixed number of message-passing steps, the latent representations of the nodes are decoded into essential or non-essential classes.
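The encode-process-decode pipeline described above can be sketched in a few lines of NumPy. This is a toy illustration, not the production model: the feature sizes, random weights, and tanh nonlinearity are assumptions, and a real implementation would learn the weights and compute features from bounding boxes, text statistics, and accessibility roles.

```python
import numpy as np

rng = np.random.default_rng(0)

F, H = 8, 16  # per-node input feature size, hidden size (toy values)
W_enc = rng.normal(scale=0.1, size=(F, H))       # encoder
W_msg = rng.normal(scale=0.1, size=(H, H))       # message function
W_upd = rng.normal(scale=0.1, size=(2 * H, H))   # node update
W_dec = rng.normal(scale=0.1, size=(H, 2))       # decoder: essential / not

def distill(node_feats, edges, steps=3):
    """Encode-process-decode message passing over a tree.

    node_feats: (N, F) array of per-node features
    edges: list of (parent, child) index pairs
    Returns (N, 2) per-node class logits.
    """
    h = np.tanh(node_feats @ W_enc)              # encode
    for _ in range(steps):                       # process: message passing
        msgs = np.zeros_like(h)
        for u, v in edges:                       # pass messages both ways
            msgs[v] += h[u] @ W_msg
            msgs[u] += h[v] @ W_msg
        h = np.tanh(np.concatenate([h, msgs], axis=1) @ W_upd)
    return h @ W_dec                             # decode to class logits

# A 4-node tree: root -> {heading, paragraph, share button}
feats = rng.normal(size=(4, F))
logits = distill(feats, edges=[(0, 1), (0, 2), (0, 3)])
print(logits.shape)  # one (essential, non-essential) logit pair per node
```

After the fixed number of message-passing steps, each node's latent representation reflects both its own features and its context in the page, which is what allows the decoder to separate essential from non-essential nodes.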
A Lightweight and Privacy-Focused Solution
To ensure broad generalization across languages and preserve privacy, we limited the feature set used by the model. Our final lightweight Android model has 64k parameters and is 334kB in size, with a median latency of 800ms. The Chrome model has 241k parameters, is 928kB in size, and has a median latency of 378ms. By processing the data on the device, we never transmit user data externally, prioritizing user privacy.
We trained the GNN for about 50 epochs and evaluated the model’s performance on webpages and native applications. The results show high precision, recall, and F1-score for essential content, headlines, and main body text.
The quality of the distilled content is high, with an F1-score exceeding 0.9 for main text (paragraphs). In practice, 88% of articles are processed without missing any paragraphs, and over 95% of readers find the distilled content valuable, perceiving it as pertinent and accurate.
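For reference, the per-class precision, recall, and F1 metrics used above can be computed as follows. The labels and predictions here are made-up toy data, not the actual evaluation set.

```python
def precision_recall_f1(y_true, y_pred, positive="essential"):
    """Per-class precision, recall, and F1 for one positive label."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Toy node labels: which tree nodes are essential main text.
y_true = ["essential", "essential", "other", "essential", "other"]
y_pred = ["essential", "other", "other", "essential", "essential"]
p, r, f1 = precision_recall_f1(y_true, y_pred)
print(round(p, 2), round(r, 2), round(f1, 2))  # 0.67 0.67 0.67
```

F1 is the harmonic mean of precision and recall, so a score above 0.9 requires the model to both find nearly all main-text nodes and rarely mislabel non-essential ones.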
Google Research has developed an on-device content distillation model that greatly improves the reading experience. By leveraging Graph Neural Networks and accessibility trees, our model accurately distills essential content and enhances the readability of articles. With a lightweight design and focus on privacy, this model provides users with a seamless and private reading journey.