10-Week Berkeley Labs Undergraduate Research
journal contribution
posted on 2018-07-26, 18:49 authored by Edsel Norwood, Shreyas Cholia

During the summer of 2018 I took part in a 10-week undergraduate
research internship at Lawrence Berkeley National Laboratory (LBL). I
worked in the Data Science and Technology department, directed by
Deborah Agarwal. During the internship I was tasked with developing
user tools for a service named Environmental Systems Science Data
Infrastructure for a Virtual Ecosystem (ESS-DIVE), which involved
working on several projects during my 10 weeks at LBL. ESS-DIVE is a data
archiving service that is funded by the U.S. Department of Energy (DOE).
The purpose of ESS-DIVE is to archive and publicly share data obtained
from observational, experimental, and modeling research that is funded
by the DOE’s Office of Science. The primary goal for the user tools
portion of the ESS-DIVE project is to fully implement an automated virus
scanning process using the VirusTotal API. This is intended to keep
users from uploading files that contain harmful or malicious
code, which matters because the ESS-DIVE archive is hosted at LBL’s
National Energy Research Scientific Computing Center (NERSC), and
allowing users to upload potentially harmful files is a major security
risk. The secondary goal is to let a user create a
shared folder on the ESS-DIVE shared endpoint via a web form. The web
form addresses a limitation: once the VirusTotal API is integrated
into ESS-DIVE, datasets larger than 35 megabytes cannot be sent to the
VirusTotal API because of a restriction on VirusTotal’s side. With
this functionality, a user will be able to fill out a form with their
Globus information and create a shared folder on the ESS-DIVE shared
endpoint. From there they can upload their dataset to the endpoint and
notify the ESS-DIVE team once they’re done. Once notified, the
ESS-DIVE team can manually submit the files to be scanned with
VirusTotal.
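
The size-based split described above could be sketched roughly as follows. This is only an illustration, not the ESS-DIVE implementation: the names `plan_upload`, `UploadPlan`, and the route labels are hypothetical, and treating one megabyte as 10**6 bytes is an assumption, since the abstract gives only the 35 megabyte figure. The sketch makes no call to VirusTotal or Globus; it only decides which path a file would take.

```python
from dataclasses import dataclass
from pathlib import Path

# 35 MB limit as stated in the abstract; 1 MB = 10**6 bytes is an assumption.
VIRUSTOTAL_MAX_BYTES = 35 * 10**6


@dataclass
class UploadPlan:
    """Hypothetical record of how one file would be handled."""
    path: Path
    size_bytes: int
    route: str  # "automated_scan" or "shared_folder"


def plan_upload(path, max_bytes=VIRUSTOTAL_MAX_BYTES):
    """Pick a route for one file based on its size on disk.

    Files at or under the limit are marked for the automated scan;
    larger files are marked for the shared-folder workflow, where
    scanning is submitted manually after the upload is complete.
    """
    path = Path(path)
    size = path.stat().st_size
    route = "automated_scan" if size <= max_bytes else "shared_folder"
    return UploadPlan(path=path, size_bytes=size, route=route)
```

A caller would run `plan_upload` on each file in a dataset and direct the user to the web form whenever any file comes back with the `shared_folder` route.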