Week 2 :- CERN-HSF @ GSoC 2018

This week was quite a busy one, as most of the development for the first deliverable (due next week) took place during this time.

First, it is worth recalling why we are developing an alternative database backend using ElasticSearch. The main reason is that a huge amount of data arrives every day, and the data itself carries a large number of parameters. Using ES is quite beneficial here, since we can create indexes on a daily or monthly basis and query them with more flexibility than MySQL.
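As a rough illustration of the daily/monthly indexing idea, the sketch below builds a time-based index name from a date. The prefix 'jobparameters' and the helper name index_name are hypothetical, chosen only for this example, and the naming scheme is an assumption rather than what the project necessarily uses.

```python
from datetime import date


def index_name(prefix, day, period="day"):
    """Return a time-based index name such as 'prefix-2018.05.21'.

    Hypothetical helper: the naming scheme is an assumption for illustration.
    """
    if period == "day":
        return "%s-%s" % (prefix, day.strftime("%Y.%m.%d"))
    if period == "month":
        return "%s-%s" % (prefix, day.strftime("%Y.%m"))
    raise ValueError("period must be 'day' or 'month'")


print(index_name("jobparameters", date(2018, 5, 21)))           # jobparameters-2018.05.21
print(index_name("jobparameters", date(2018, 5, 21), "month"))  # jobparameters-2018.05
```

With names like these, a query for a given period only has to touch the indexes covering that period, and old data can be dropped one index at a time.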

I mainly focused on, and completed, the following tasks:

  • Create a wrapper over ElasticSearchDB.py for JobParameters with two methods: 'get' and 'set'.
    • The function 'get' fetches the parameters from the ElasticSearch (ES) backend using the ElasticSearch-py API 'query'. It takes a jobID (which acts like a primary key, although ES has no keys as such) and an optional list of parameter names to restrict the query.
    • The function 'set' updates or inserts the values of JobParameters in the ES backend using the ES-python APIs 'update-by-query' and 'index'. It first checks whether the given jobID exists; if it is not found, the record is inserted into the ES index.
  • Add a flag in the JobMonitoringHandler to activate the ElasticSearch backend, as set in the configuration file 'dirac.cfg'.
    • The basic operation is to read the newly introduced flag 'useES' from the configuration file and, based on its boolean value, activate the backend.
  • Add an update method to ElasticSearchDB.py, which was previously missing. It is a wrapper over the ES-python APIs 'update-by-query' and 'index', so that code outside the script does not access the ES-py module directly.
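
The get/set logic and the backend flag from the list above could be sketched roughly as follows. This is only an illustration under stated assumptions: InMemoryES is a stand-in for the real ElasticSearchDB.py wrapper so that the example runs without a cluster, a plain dict stands in for 'dirac.cfg', and the names JobParametersStore, exists, update, index, query and use_es_backend are hypothetical rather than the actual DIRAC code.

```python
class InMemoryES(object):
    """Stand-in for the ES wrapper; keeps one document per jobID in a dict."""

    def __init__(self):
        self._docs = {}

    def exists(self, job_id):
        return job_id in self._docs

    def index(self, job_id, body):
        # Insert a new document (the real code would call the ES 'index' API).
        self._docs[job_id] = dict(body)

    def update(self, job_id, body):
        # Merge new values into an existing document
        # (the real code would go through 'update-by-query').
        self._docs[job_id].update(body)

    def query(self, job_id):
        # Return a copy of the stored document, or {} if the jobID is unknown.
        return dict(self._docs.get(job_id, {}))


class JobParametersStore(object):
    """Hypothetical wrapper exposing 'get' and 'set' for job parameters."""

    def __init__(self, db):
        self.db = db

    def set(self, job_id, name, value):
        # Update if the jobID is already known, otherwise insert it.
        if self.db.exists(job_id):
            self.db.update(job_id, {name: value})
        else:
            self.db.index(job_id, {name: value})

    def get(self, job_id, names=None):
        # Return all parameters, or only the requested names if given.
        params = self.db.query(job_id)
        if names:
            params = {k: v for k, v in params.items() if k in names}
        return params


def use_es_backend(config):
    """Read a boolean-like 'useES' flag from a dict standing in for dirac.cfg."""
    return str(config.get("useES", "False")).strip().lower() in ("true", "yes", "1")


store = JobParametersStore(InMemoryES())
store.set(1001, "CPUTime", 42)       # jobID not found yet, so it is inserted
store.set(1001, "Status", "Done")    # jobID exists now, so it is updated
print(store.get(1001, ["Status"]))
print(use_es_backend({"useES": "True"}))
```

Keeping the update-or-insert decision inside 'set' means callers never need to know whether a jobID has been seen before, and routing every call through the wrapper keeps the ES client out of the calling code.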
That was all for this week, and a commit covering this work has been made. Find the commit here. Tasks for next week can be tracked in the task manager (check sidebar).
