Moving the frontiers of collaborative data science for public good

Armenia SDG Innovation Lab
Apr 20, 2021

Long gone are the days when “unicorn” data scientists were a thing. Now it’s all about teamwork and open-source data collaborations for public good.

At the Armenia National SDG Innovation Lab, our passion lies in mainstreaming innovation and evidence-based decision-making in public policymaking. Data is the backbone of any decision-making: the private sector already uses mobile-device data, remote sensing, and social media for marketing, advertising, and better management. At the SDG Innovation Lab, we use insights from both conventional and non-conventional data sources to inform policy and advance data-driven governance, service delivery, and decision-making for sustainable development.

Machine learning and natural language processing are widely used in the major projects undertaken at the Lab, including projects that improve citizen-government correspondence, improve maternal health, develop sustainable tourism, bridge education and the labor market, and much more. We use predictive models for language classification, data mining, time series forecasting, and similar tasks, and we continuously build the capabilities of the SDG Lab’s data team through independent research and by mapping existing open-source repositories for relevant machine learning datasets and models.

As machine learning and natural language processing gain momentum in almost every sector imaginable, companies are realizing the importance of having a whole team of people instead of a lone “unicorn” data scientist. Many also realize the importance of collaborating with other data science teams to exchange discoveries, build shared know-how, and spot opportunities for improvement.

Tackling big challenges requires huge efforts, but there are just way too many problems to be handled alone. Ever since its inception, Armenia SDG Innovation Lab has been reinventing development sector practices by acting as a catalyst for brewing evidence-based and data-driven policy solutions via cross-sectoral collaboration. SDG Lab has been transforming the way that the private sector contributes to the public good. To determine how data can be leveraged to tackle some pertinent challenges as well as develop mechanisms to assist decision-makers in setting out large-scale policy questions, we came up with a new collaboration model that converges all our stakeholders, including other data science teams, around a common shared outcome.

Collaboration with other data science teams is critical because it enables a mutually rewarding opportunity to leverage community knowledge and open-source datasets for positive public change. Moreover, in our case it also means doing cutting-edge public policy analysis and having unprecedented data access for experimentation.

Building on our experience and recognizing the private sector as a driving force for sustainable development, the Lab came up with a new partnership initiative with the DeepPavlov project team, the Neural Networks and Deep Learning Laboratory at the Moscow Institute of Physics and Technology.

DeepPavlov conducts research on deep neural network architectures for working with natural language text and is open to sharing its NLP/NLU tools with the wider data science community. Our data scientists use the resources made available by the DeepPavlov project team daily, and DeepPavlov’s open-source repository is an invaluable facilitator of the work at the SDG Innovation Lab. We make extensive use of the DeepPavlov multilingual BERT Named Entity Recognition model, combining it with the Stanford NLP morphological parser to extract information from Armenian-language text.
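To give a feel for what working with a Named Entity Recognition model involves, here is a minimal sketch of one common post-processing step: merging token-level tags into whole entities. It is not the Lab's actual code. It assumes the model returns a list of tokens and a parallel list of BIO tags (such as B-ORG, I-ORG, O), which is the format DeepPavlov's NER models produce. The function name `group_entities` and the example tokens and tags are hypothetical and made up for illustration; they are not real model output.

```python
from typing import List, Optional, Tuple

# Assumption: tokens and tags would come from an NER model, for example
# (not run here, since it downloads a large model):
#   from deeppavlov import build_model
#   ner = build_model("ner_ontonotes_bert_mult", download=True)
#   tokens_batch, tags_batch = ner(["some text"])


def group_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge BIO-tagged tokens into (entity_text, entity_type) pairs."""
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins; close the previous one if there is one.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O" tag or an I- tag that does not match: close any open entity
            # and skip this token.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None

    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Hypothetical, hand-written example (not real model output).
example_tokens = ["Armenia", "SDG", "Innovation", "Lab", "is", "based", "in", "Yerevan", "."]
example_tags = ["B-ORG", "I-ORG", "I-ORG", "I-ORG", "O", "O", "O", "B-GPE", "O"]
print(group_entities(example_tokens, example_tags))
```

The grouped entities can then be matched against the output of a morphological parser, for instance to normalize inflected names, although how that step is done depends on the parser and is not shown here.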

Furthermore, the SDG Lab team takes advantage of the Transformer Classifier Model that the DeepPavlov team implemented in PyTorch, combining it with XLM-RoBERTa, one of the most robust multilingual transformer models. DeepPavlov provides access to end-to-end, state-of-the-art NLP pipelines, including modules for preprocessing, that make our workflow much more streamlined and object-oriented.
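For readers new to text classification, the last step of such a pipeline can be sketched in a few lines. A transformer classifier typically outputs one raw score (a logit) per category, and a softmax turns those scores into probabilities. The sketch below is a generic illustration in plain Python, not DeepPavlov's implementation; the function names, the category names, and the example scores are all hypothetical.

```python
import math
from typing import List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_label(logits: List[float], labels: List[str]) -> Tuple[str, float]:
    """Return the most probable label together with its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical categories and scores, for illustration only.
categories = ["health", "education", "tourism"]
scores = [0.2, 2.5, -1.0]
label, probability = pick_label(scores, categories)
print(label, round(probability, 3))
```

Keeping the probability alongside the label is a deliberate choice in this sketch: a low value can flag a text that is better reviewed by a person.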

The partnership with DeepPavlov strengthens the Lab’s machine learning workflow. Notably, DeepPavlov’s resources will pave the way for a more sustainable and institutionalized partnership between the two institutions. Both teams share a passion for pushing the boundaries of AI, and we are building on it to partner with technology pioneers and apply their know-how in the Lab’s pursuit of the SDGs and beyond.

What lies ahead?

  • Using the resources made available by DeepPavlov, we are currently building tools to automate real-time classification of written citizen-government correspondence, and beyond.
  • Temporal knowledge graphs: a new exciting area for collaboration coming in the future.

Private institutions are converging towards the achievement of development results, and one of the SDG Lab’s priorities is to facilitate this convergence, leading to shared responsibility for development challenges.

We are just getting started, and we have already learned heaps along the way: our commitment to extending the reach and impact of our projects is strengthened by shared passion and enthusiasm. So if you think you are with us on this, join us on this journey and get in touch!

Armenia SDG Innovation Lab

World’s first National SDG Lab — accelerating #ADS2030 and #SDGs implementation in Armenia. Joint initiative of the Government of Armenia & the UN.