Monthly Archives: June 2017

Provides readers with detailed summaries

From Reddit to Quora, discussion forums can be equal parts informative and daunting. We’ve all fallen down rabbit holes of lengthy threads that are impossible to sift through. Comments can be redundant, off-topic or even inaccurate, but all that content is ultimately still there for us to try and untangle.
Sick of the clutter, a team from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) has developed “Wikum,” a system that helps users construct concise, expandable summaries that make it easier to navigate unruly discussions.
“Right now, every forum member has to go through the same mental labor of squeezing out key points from long threads,” says MIT Professor David Karger, who was senior author on a new paper about Wikum. “If every reader could contribute that mental labor back into the discussion, it would save that time and energy for every future reader, making the conversation more useful for everyone.”
The team tested Wikum against a Google document with tracked changes that aimed to mimic the collaborative editing structure of a wiki. They found that Wikum users completed reading much faster and recalled discussion points more accurately, and that editors made edits 40 percent faster.
Karger wrote the new paper with PhD students Lea Verou and Amy Zhang, who was lead author. The team presented the work last week at ACM’s Conference on Computer-Supported Cooperative Work and Social Computing in Portland, Oregon.
How it works
While wikis can be a good way for people to summarize discussions, they aren’t ideal because users can’t see what’s already been summarized. This makes it difficult to break summarizing down into small steps that can be completed by individual users, because it requires that they spend a lot of energy figuring out what needs to happen next. Meanwhile, forums like Reddit let users “upvote” the best answers or comments, but lack contextual summaries that help readers get detailed overviews of discussions.
Wikum bridges the gap between forums and wikis by letting users work in small doses to refine a discussion’s main points, and giving readers an overall “map” of the conversation.
Readers can import discussions from places such as Disqus, a commenting platform used by publishers like The Atlantic. Then, once users create a summary, readers can examine the text and decide if they want to expand the topic to read more. The system uses color-coded “summary trees” that show topics at different levels of depth and lets readers jump between original comments and summaries.
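To make the idea of an expandable summary tree concrete, here is a minimal sketch in Python. It is not Wikum's code: the `Node` class, the `render` function, and the sample thread are all hypothetical, and the sketch assumes a summary simply stands in for a comment and its replies until a reader expands it.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional


@dataclass
class Node:
    """One comment, optionally carrying a summary of itself and its replies."""
    node_id: str
    text: str
    summary: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def render(node: Node, expanded: FrozenSet[str] = frozenset(), depth: int = 0) -> List[str]:
    """Return display lines; a summarized subtree stays collapsed unless expanded."""
    indent = "  " * depth
    if node.summary is not None and node.node_id not in expanded:
        return [f"{indent}[summary] {node.summary}"]
    lines = [f"{indent}{node.text}"]
    for child in node.children:
        lines.extend(render(child, expanded, depth + 1))
    return lines


# Hypothetical thread: one reply chain has been summarized by an editor.
thread = Node("c1", "Is the new bike lane worth it?", children=[
    Node("c2", "Yes, commutes are safer.", summary="Supporters cite safety.", children=[
        Node("c3", "Agreed, fewer close calls."),
    ]),
    Node("c4", "Parking got harder."),
])

collapsed_view = render(thread)
expanded_view = render(thread, expanded=frozenset({"c2"}))
```

In the collapsed view the reader sees the one-line summary in place of comments c2 and c3; expanding c2 swaps the summary for the original comments, which is the jump between summaries and source text described above.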

Help make a ubiquitous model of decision processes

Markov decision processes are mathematical models used to determine the best courses of action when both current circumstances and future consequences are uncertain. They’ve had a huge range of applications — in natural-resource management, manufacturing, operations management, robot control, finance, epidemiology, scientific-experiment design, and tennis strategy, just to name a few.
But analyses involving Markov decision processes (MDPs) usually make some simplifying assumptions. In an MDP, a given decision doesn’t always yield a predictable result; it could yield a range of possible results. And each of those results has a different “value,” meaning the chance that it will lead, ultimately, to a desirable outcome.
Characterizing the value of a given decision requires collection of empirical data, which can be prohibitively time-consuming, so analysts usually just make educated guesses. That means, however, that the MDP analysis doesn’t guarantee the best decision in all cases.
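A small Python sketch shows why data collection matters here. The outcome probabilities and values below are invented for illustration, and the estimator is plain sample averaging, not the researchers' technique. The sketch compares the exact value of one decision with an estimate built from repeated trials.

```python
import random
from typing import Dict, Tuple

# Hypothetical decision: each outcome maps to (probability, value).
OUTCOMES: Dict[str, Tuple[float, float]] = {
    "good": (0.2, 1.0),
    "neutral": (0.5, 0.4),
    "bad": (0.3, 0.0),
}


def true_value(outcomes: Dict[str, Tuple[float, float]]) -> float:
    """Exact value of the decision: probability-weighted average of outcome values."""
    return sum(p * v for p, v in outcomes.values())


def estimate_value(outcomes: Dict[str, Tuple[float, float]],
                   trials: int, rng: random.Random) -> float:
    """Empirical estimate: carry out the decision `trials` times and average the results."""
    values = [v for _, v in outcomes.values()]
    weights = [p for p, _ in outcomes.values()]
    samples = rng.choices(values, weights=weights, k=trials)
    return sum(samples) / trials


rng = random.Random(0)  # fixed seed so the run is repeatable
exact = true_value(OUTCOMES)
rough = estimate_value(OUTCOMES, 10, rng)
better = estimate_value(OUTCOMES, 10_000, rng)
```

In practice the analyst does not know the probabilities and sees only the samples, so the quality of the estimate depends on how many trials can be afforded. With simple averaging, the error shrinks roughly with the square root of the number of trials.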
In the Proceedings of the Conference on Neural Information Processing Systems, published last month, researchers from MIT and Duke University took a step toward putting MDP analysis on more secure footing. They show that, by adopting a simple trick long known in statistics but little applied in machine learning, it’s possible to accurately characterize the value of a given decision while collecting much less empirical data than had previously seemed necessary.
In their paper, the researchers described a simple example in which the standard approach to characterizing probabilities would require the same decision to be performed almost 4 million times in order to yield a reliable value estimate.

Database queries could prevent customer profiling

Most website visits these days entail a database query — to look up airline flights, for example, or to find the fastest driving route between two addresses.
But online database queries can reveal a surprising amount of information about the people making them. And some travel sites have been known to jack up the prices on flights whose routes are drawing an unusually high volume of queries.
At the USENIX Symposium on Networked Systems Design and Implementation next week, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory and Stanford University will present a new encryption system that disguises users’ database queries so that they reveal no private information.
The system is called Splinter because it splits a query up and distributes it across copies of the same database on multiple servers. The servers return results that make sense only when recombined according to a procedure that the user alone knows. As long as at least one of the servers can be trusted, it’s impossible for anyone other than the user to determine what query the servers executed.
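The article does not spell out Splinter's protocol, so the Python sketch below substitutes a textbook two-server scheme (XOR-based private information retrieval) to illustrate the general idea of splitting a query and recombining the answers. The function names and the sample fare records are hypothetical, and the sketch assumes every record has the same length.

```python
import secrets
from typing import List, Tuple


def make_queries(n: int, index: int) -> Tuple[List[int], List[int]]:
    """Split 'fetch record `index`' into two bit vectors that each look random."""
    q1 = [secrets.randbelow(2) for _ in range(n)]
    q2 = q1.copy()
    q2[index] ^= 1  # the two vectors differ only at the wanted position
    return q1, q2


def answer(database: List[bytes], query: List[int]) -> bytes:
    """Server side: XOR together every record that the query vector selects."""
    result = bytes(len(database[0]))
    for record, bit in zip(database, query):
        if bit:
            result = bytes(a ^ b for a, b in zip(result, record))
    return result


def recombine(a1: bytes, a2: bytes) -> bytes:
    """Client side: XOR the two answers; every record except the wanted one cancels."""
    return bytes(x ^ y for x, y in zip(a1, a2))


# Both servers hold identical copies of the same (hypothetical) database.
db = [b"BOS-SFO $310", b"BOS-JFK $120", b"BOS-LAX $295"]
q1, q2 = make_queries(len(db), 1)
fetched = recombine(answer(db, q1), answer(db, q2))
```

Each query vector on its own is a uniformly random string of bits, so a single server learns nothing about which record was wanted. That protection holds only if the two servers do not compare notes, which mirrors the requirement that at least one server can be trusted.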
“The canonical example behind this line of work was public patent databases,” says Frank Wang, an MIT graduate student in electrical engineering and computer science and first author on the conference paper. “When people were searching for certain kinds of patents, they gave away the research they were working on. Stock prices is another example: A lot of the time, when you search for stock quotes, it gives away information about what stocks you’re going to buy. Another example is maps: When you’re searching for where you are and where you’re going to go, it reveals a wealth of information about you.”
Honest broker
Of course, if the site that hosts the database is itself collecting users’ data without their consent, the requirement of at least one trusted server is difficult to enforce.
Wang, however, points to the increasing popularity of services such as DuckDuckGo, a search engine that uses search results from other sites, such as Bing and Yahoo, but vows not to profile its customers.
“We see a shift toward people wanting private queries,” Wang says. “We can imagine a model in which other services scrape a travel site, and maybe they volunteer to host the information for you, or maybe you subscribe to them. Or maybe in the future, travel sites realize that these services are becoming more popular and they volunteer the data. But right now, we’re trusting that third-party sites have adequate protections, and with Splinter we try to make that more of a guarantee.”