As a graduate research assistant (GRA) at the Texas Advanced Computing Center, I worked with Dr. Weijia Xu and Wei Luo to develop a new approach to importing data into a Hadoop cluster. We modified Hadoop's sequential data-import code to run in parallel, which reduced the time needed to import data into a cluster by up to 80%.
We presented our work at the XSEDE 2012 conference in Chicago, Illinois.