Applications closed for Outreachy Internships with the M-Lab team

Jordan McCarthy and Steph Alarcon

Aug. 11, 2015

Tuesday, August 11, 2015: We've received several new requests for information but we do not have open slots for the next round at the moment. If you'd like to contribute to M-Lab independently, please refer to this page for ways you can get involved. But at this time we are unable to guide new applicants through an introduction.

Monday March 23: Please note that we are not able to accept new initial contributions. If you have already been in touch with us, please feel free to pursue your application, but we are beyond capacity for supporting new contributions. Please refer to the Outreachy main program page for other projects that may be able to accommodate new contacts.

Welcome, Outreachy candidates! Measurement Lab (M-Lab) is excited to welcome candidates for Round 10, May 25-Aug 25, 2015! Please check out the Outreachy main program page for all the info you need for Round 10. It has deadlines, background information, eligibility requirements, other participating organizations, contact information for people who can help you with your application, and the application system. Below, you’ll find descriptions of projects we think might be interesting for Outreachy participants.

Application Process and M-Lab Contacts

Read up on the Outreachy program.
Look over our project ideas below. If nothing there strikes your fancy, feel free to suggest one of your own. We love new ideas!
Join the #oti or #measurementlab channels on irc.freenode.net and introduce yourself! Please address one or all of our mentors (technosopher, salarcon, georgiabullen) so we are sure to see your message.
Contact our mentors via email (outreachy@measurementlab.net) to let us know that you're interested in contributing. We'll talk about how to get started.
Submit your application through the application system. Please be available and responsive throughout the application period so we can work with you on making it as strong as possible.

About M-Lab

Measurement Lab (M-Lab) is an open, distributed server platform on which researchers can deploy open source Internet measurement tools. The data collected by those tools is released in the public domain. The goal of M-Lab is to advance network research and empower the public with useful information about their broadband and mobile connections. By enhancing Internet transparency, M-Lab helps sustain a healthy, innovative Internet.

Here are two of our flagship measurement and analysis tools, and a major research output from 2014:

NDT (Network Diagnostic Test) http://www.measurementlab.net/tools/ndt
Internet Observatory http://www.measurementlab.net/observatory
ISP Interconnection and its Impact on Consumer Internet Performance http://www.measurementlab.net/blog/2014_interconnection_report

We are a consortium of research, industry, and public interest partners, one of which is the Open Technology Institute (OTI), based at New America in Washington, DC. OTI strengthens communities through grounded research, technological innovation, and policy reform. We create reforms to support open source innovations and foster open technologies and communications networks. Partnering with communities, researchers, industry and public interest groups, we promote affordable, universal, and ubiquitous communications networks. OTI staff will be the primary mentors; this is OTI's third round of Outreachy participation.

Available projects

Rather than describing any one project in a lot of detail, we’re giving prospective Outreachy interns two broad categories of work to choose from. Some of these projects are pretty big. We will work with you to break them into achievable steps and figure out what can and can’t be accomplished during the internship period. If either or both of these categories appeal to you, send us a note, and we’ll work with you to sketch out the details of a specific initial contribution and final project!

Data analysis, visualization, and policy impacts

Description: Projects in this category will involve diving into the M-Lab dataset of measures of Internet performance, and figuring out how to transform this data into structures and stories that are more accessible and useful to policymakers, activists, journalists, and ordinary people.

Skills to bring or develop: database queries, Google BigQuery, documentation, finding stories in data, visualization, statistical analysis

OTI can offer: Sample queries and guidance on developing and documenting more queries, guidance on visualization tools, access to many friendly subject matter experts in policy and statistics

Sample initial contribution: Learn how to query the M-Lab dataset using Google BigQuery, and describe the process and the results. Does the data match what you expected? How did you decide how to structure the query? What problems did you run into while learning how to use BigQuery? What does the data mean and represent?

Sample project ideas

Data signal searching

Explore and develop methods to draw useful information out of the M-Lab data in ways that are reproducible and statistically robust. For example, the data might show significant dips in Internet performance in certain regions and times of day. We’d need a way to figure out whether those dips are strange anomalies that warrant more study, or if they’re just products of natural variation. That method would need to be trustworthy and reproducible by other network researchers.
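As one possible starting point for telling real dips from natural variation, here is a minimal sketch that scores each measurement against the median, using the median absolute deviation (MAD) as the measure of spread. The function names, the 3.5 threshold, and the sample numbers are illustrative assumptions, not part of any M-Lab tool, and a real analysis would need to justify the method against the actual distribution of the data.

```python
from statistics import median


def robust_z_scores(values):
    """Return a modified z-score for each value, based on median and MAD."""
    values = list(values)
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread at all: nothing can be called unusual.
        return [0.0 for _ in values]
    # 0.6745 rescales the MAD so that scores are comparable to ordinary
    # z-scores when the data is roughly normally distributed.
    return [0.6745 * (v - med) / mad for v in values]


def flag_dips(values, threshold=3.5):
    """Return the indexes of values that fall unusually far below the median."""
    return [i for i, z in enumerate(robust_z_scores(values)) if z < -threshold]


# Hypothetical hourly median download speeds (Mbps) for one region.
hourly_mbps = [21.0, 20.5, 22.1, 19.8, 20.9, 21.4, 6.2, 20.7, 21.9, 20.1]
print(flag_dips(hourly_mbps))  # indexes of hours worth a closer look
```

Median-based scores are used here because a single extreme dip barely moves the median, whereas it would inflate a mean and standard deviation and could hide itself.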

Assess performance impact of major regulatory changes

Choose a particular Internet-related law or regulatory action that took place in a particular region (like this one in the European Union), and analyze its effects on Internet users in the affected region. On what measurements were the regulatory changes based? Were they valid? What do our measurements show?

National routing

Investigate, for a specific country or region, how often Internet traffic is being routed outside of the country/region, even when the final destination of that traffic is local. (This is a very hot topic right now.) Ideally, produce a tool or script that makes this kind of analysis easier to do in the future.
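A rough sketch of the core check such a script might perform is shown below. It assumes each path has already been reduced to a list of country codes, one per hop, with None for hops that could not be located; how hops get geolocated is a separate and difficult problem that is not shown. All names and sample paths are hypothetical.

```python
def is_domestic(path, home):
    """True if the first and last locatable hops are both in the home country."""
    known = [c for c in path if c is not None]
    return len(known) >= 2 and known[0] == home and known[-1] == home


def leaves_country(path, home):
    """True if a domestic-to-domestic path passes through another country."""
    if not is_domestic(path, home):
        return False
    return any(c is not None and c != home for c in path)


def detour_rate(paths, home):
    """Fraction of domestic-to-domestic paths that leave the home country."""
    domestic = [p for p in paths if is_domestic(p, home)]
    if not domestic:
        return None  # nothing to measure
    return sum(leaves_country(p, home) for p in domestic) / len(domestic)


# Hypothetical per-hop country codes (None = hop could not be located).
sample_paths = [
    ["BR", "BR", "BR"],
    ["BR", "US", "BR"],
    ["BR", None, "US", "US", "BR"],
    ["BR", "BR", "US"],  # destination abroad, so not counted
    ["BR", "BR", None, "BR"],
]
print(detour_rate(sample_paths, "BR"))
```

Unlocated hops are ignored rather than counted as foreign, so this estimate errs toward undercounting detours.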

Socioeconomic geography of access & performance

How are economic privilege and geographic region correlated with Internet performance? Where is Internet service most affordable? Reliable? Priced fairly for the service provided? OTI has already done some work on this question by mashing up regional census data with regional M-Lab data, so a candidate could leverage and extend those methods and findings.
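A first pass at the correlation question could look like the sketch below, which computes a Pearson correlation coefficient between two per-region series. The figures are invented for illustration and this is not the method OTI used in its earlier work; a correlation also says nothing by itself about cause.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points each")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-region figures: median household income (thousands of USD)
# and median measured download speed (Mbps).
median_income = [32, 41, 47, 55, 63, 78]
median_mbps = [8.1, 9.4, 12.0, 11.2, 15.8, 19.5]
print(round(pearson(median_income, median_mbps), 3))
```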

Network experimentation and coding of network tests

Description: Projects in this category will involve designing and implementing new methods for Internet measurement, which will help create a clearer and more accurate picture both of how well individuals are being served by their own Internet connections, and of the health of the global Internet at large.

Skills to bring or develop: Familiarity with scripting languages like bash or Python; knowledge of networking protocols and testing utilities. Traceroute will likely be heavily used, but lots of other handy tools are available.

OTI can offer: We don’t have the capacity for heavy duty scripting training, but can do code reviews, help with troubleshooting and developing testing ideas, and we can connect you with other organizations and online resources that teach scripting.

Sample initial contribution: Write a script to automatically run a network test of your choosing (such as ping or traceroute) at regular intervals, and store the data in a file for later analysis.
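To give a sense of scale, a contribution like this can be quite small. The sketch below is one possible shape for it, using only the Python standard library; the function name, log format, and file name are arbitrary choices made for illustration.

```python
import subprocess
import time
from datetime import datetime, timezone


def run_periodically(command, runs, interval_seconds, out_path):
    """Run `command` `runs` times, appending timestamped output to `out_path`."""
    with open(out_path, "a", encoding="utf-8") as log:
        for i in range(runs):
            started = datetime.now(timezone.utc).isoformat()
            result = subprocess.run(command, capture_output=True, text=True)
            # One header line per run, so the log can be split up again later.
            log.write(f"=== {started} exit={result.returncode} ===\n")
            log.write(result.stdout)
            if not result.stdout.endswith("\n"):
                log.write("\n")
            log.flush()
            if i < runs - 1:
                time.sleep(interval_seconds)


# Example call (not executed here). "-c 4" is the packet-count flag for ping
# on Linux and macOS; Windows uses "-n" instead.
# run_periodically(["ping", "-c", "4", "measurementlab.net"],
#                  runs=24, interval_seconds=3600, out_path="ping_log.txt")
```

Passing the command as a list avoids shell quoting problems, and the same function works for traceroute or any other command-line test.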

Sample project ideas

A/B experiments and experiment validation

Create a script framework for repeatedly running specific network experiments under two different conditions, and then checking the results for meaningful differences. For example, it would be interesting to test how the same experiments perform over different IP versions (IPv4 vs. IPv6), on different ports (80 vs. 443), and using different control protocols (e.g., a custom binary protocol vs. WebSockets).
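For the "checking the results for meaningful differences" step, one option is a permutation test, sketched below. This is an assumption about how the comparison might be done, not a prescribed method, and the throughput figures are made up. The test repeatedly shuffles the pooled results and counts how often a random split produces a gap in means at least as large as the one observed.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_test(a, b, rounds=10000, seed=0):
    """Estimate a p-value for the observed difference in means between a and b."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # The +1 terms keep the estimate from ever being exactly zero.
    return (extreme + 1) / (rounds + 1)


# Hypothetical download throughput (Mbps) for the same test run on two ports.
port_80 = [10, 11, 9, 10, 12, 10, 11, 9]
port_443 = [20, 21, 19, 22, 20, 21, 19, 20]
print(permutation_test(port_80, port_443))
```

A small value suggests the two conditions really do differ; a value near 1 means the observed gap is the kind that shuffling alone produces.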

Mapping of major web services & content providers to transit provider networks

Create a framework for testing the routes and performance from end-users to big content providers over time.

For example, write a test to record traceroutes between the user and Amazon/Google/Twitter, etc. A good starting point for big content providers is the “Alexa 500”, a list of the 500 top websites.

These tests could:

Better investigate and explain performance problems that affect certain websites for certain subsets of their users. For example, why is Netflix access fast at work but slow at home? Why is it slow for my neighbor but fast for me?
Explain to consumers why they should care about peering. For example, “You get to Amazon over a network that’s suddenly incredibly crowded, which is why you’re having a hard time using that site.”
Detect sudden changes in network performance or peering arrangements, like the ones that happened between Netflix and a bunch of ISPs last February.
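The change-detection idea can be sketched as a comparison between successive recorded paths to the same destination. The example below assumes each traceroute has already been parsed into an ordered list of hop addresses; the data layout, function names, and addresses (taken from the reserved documentation ranges) are hypothetical. Real paths also vary from run to run because of load balancing, so a practical tool would need several runs per time slot before calling something a change.

```python
def changed_hops(before, after):
    """Return (position, old, new) for every hop position where two paths differ."""
    length = max(len(before), len(after))
    padded_before = list(before) + [None] * (length - len(before))
    padded_after = list(after) + [None] * (length - len(after))
    return [
        (position, old, new)
        for position, (old, new) in enumerate(zip(padded_before, padded_after), start=1)
        if old != new
    ]


def route_changes(snapshots):
    """Given [(label, hops), ...] in time order, list the snapshots whose path changed."""
    changes = []
    for (_, previous), (label, current) in zip(snapshots, snapshots[1:]):
        diff = changed_hops(previous, current)
        if diff:
            changes.append((label, diff))
    return changes


# Hypothetical daily paths to one content provider.
snapshots = [
    ("2015-02-01", ["192.0.2.1", "198.51.100.1", "203.0.113.5"]),
    ("2015-02-02", ["192.0.2.1", "198.51.100.1", "203.0.113.5"]),
    ("2015-02-03", ["192.0.2.1", "203.0.113.9", "203.0.113.5"]),
]
print(route_changes(snapshots))
```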

Improve user interfaces for running tests, making sense of results, and gathering metadata

Contribute to ongoing efforts to create more informative and streamlined user interfaces for running tests, and for displaying relevant data from BigQuery once the test has completed (e.g., results from the surrounding region, so users have some sense of how their own results measure up). Some existing interfaces:

NDT (Network Diagnostic Test) http://www.measurementlab.net/tools/ndt
NDT Chrome app https://github.com/bchase78/ndt-chromeapp
Internet Observatory http://www.measurementlab.net/observatory
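One simple way an interface could show users how their results measure up against the surrounding region is a percentile rank. The sketch below is only an illustration of that calculation; the function name and sample speeds are assumptions, and in practice the regional values would come from a BigQuery query that is not shown.

```python
def percentile_rank(value, regional_values):
    """Return the percentage of regional results that `value` meets or beats.

    Ties count as half, so a result equal to the regional median scores 50.
    """
    if not regional_values:
        raise ValueError("no regional results to compare against")
    below = sum(1 for v in regional_values if v < value)
    equal = sum(1 for v in regional_values if v == value)
    return 100.0 * (below + 0.5 * equal) / len(regional_values)


# Hypothetical download speeds (Mbps) measured recently in the user's region.
regional_mbps = [5, 10, 15, 20, 25]
print(percentile_rank(15, regional_mbps))
```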


Measurement Lab is participating in the Outreachy (formerly Gnome Outreach Program for Women) May-August 2015 round. Read more to find out how to apply!
