Data Engineer


We are pleased to announce the following vacancy in the Big Data and Business Analytics department within the Finance Division. In keeping with our current business needs, we are looking for a person who meets the criteria indicated below.

Brief Description

Reporting to the HOD – Big Data and Business Analytics, the position holder will design, build and deliver the big data platform that will serve as Safaricom’s single source of truth. The platform will be used to continuously deliver on Safaricom’s overall data analytics strategy. Safaricom is investing heavily in big data, and this will be a truly exciting role in view of the organization’s unique data set and position in this region.

Role Responsibilities

Design, architect and build solutions and tools for the big data platform
Mediate and coordinate resolution of software project deliverables using agile methodology
Develop pipelines to ingest data into the big data platform based on business demands and use cases
Develop analytical platforms that will be used to avail data to end users for exploration, advanced analytics and visualizations for day-to-day business reporting
Provide guidance and advice to technology teams on the best use of the latest technologies and designs to deliver a best-in-class platform in the most cost-effective way
Develop automated monitoring solutions to be handed over to support teams to run and operate the platform efficiently
Automate and productionize data science models on the big data engineering platform
Perform technical aspects of big data development for assigned applications, including design, prototype development, and coding assignments.
Build analytics software through consistent development practices that will be used to deliver data to end users for exploration, advanced analytics and visualizations for day-to-day business reporting.
Plan and deliver highly scalable distributed big data systems using open-source technologies including, but not limited to, Apache Kafka, NiFi, HBase, Cassandra, Hive, MongoDB, Postgres and Redis.
Code, test, and document scripts for managing different data pipelines and the big data cluster.
Receive escalated, technically complex, mission-critical issues and maintain ownership of each issue until it is completely resolved.
Troubleshoot incidents hands-on: formulate theories, test hypotheses, and narrow down possibilities to find the root cause.
Develop tools and scripts to automate troubleshooting activities.
Drive further improvements in the big data platform, tooling and processes.
Upgrade products/services and apply patches as necessary.
Maintain backups of, and restore when needed, the ETL and Reports repositories and other system binaries and source code.
Build tools for yourself and others to increase efficiency and to make hard or repetitive tasks easy and quick.
Develop machine learning algorithms and libraries for problem solving and AI operations.
Research and provide input on design approach, performance and base functionality improvements for various software applications.


Qualifications

BS or MS in computer science or equivalent practical experience
At least 2-3 years of coding experience in a non-university setting.
Experience in Object Oriented development
Proficient understanding of distributed computing principles
Experience in collecting, storing, processing and analyzing large volumes of data.
Experience with building stream-processing systems, using solutions such as Storm or Spark-Streaming
Experience with various messaging systems, such as Kafka or RabbitMQ
Experience with NoSQL databases, such as HBase, Cassandra, MongoDB
Understanding of big data technologies: Cloudera/MapR/Hortonworks
Highly proficient in more than one modern language, e.g. Java/C#/NodeJS/Python/Scala.
Experience with relational data stores as well as one or more NoSQL data stores (e.g., Mongo, Cassandra).
Demonstrated proficiency with data structures, algorithms, distributed computing, and ETL systems.
Good knowledge of and experience with big data frameworks such as Apache Hive and Spark.
Working knowledge of, and experience with, SQL scripting.
Experience in deploying and managing Machine Learning models at scale is an added advantage.
Hands-on implementation and delivery of Apache Spark workloads in an Agile working environment is an added advantage.



About Safaricom Kenya

Safaricom is a leading communications company in Kenya with the widest and strongest coverage. It is the home of the famous mobile money service M-PESA.