The Austin Hadoop User Group: 2012

Our next meeting will be on Thursday, March 8, at Bazaarvoice from 6:30 to 9pm.

As usual, we'll have plenty of pizza, beer and tacos. This event is free and open to everyone, and many of our attendees are new to Hadoop and Big Data.

Agenda

6:30 - 7:00 : Meet and Greet (Austin's Pizza, Quality Beer and Tacos)

7:00 - 7:30 : "IronFan" - Flip Kromer, CTO, InfoChimps - @mrflip

Flip will be presenting on IronFan, which was recently covered on GigaOm and Wired Enterprise. IronFan is a systems provisioning and deployment tool that automates not only machine configuration but entire systems configuration, enabling the full Big Data stack, including tools for data ingestion, scraping, storage, computation and monitoring.

7:30 - 8:15 : "Building the Social Business Index" - Jeremy Hanna, Jacob Perkins and John De Oliveira from The Dachis Group - @jeromatron @thedatachef @johndeo

Jeremy, Jacob and John will be presenting on building data products and how they designed and built The Dachis Group's Social Business Index. The Social Business Index analyzes signals from over one hundred million social sources globally and analyzes the performance of the largest global companies and thousands of those companies' brands. Through the use of natural language processing, semantic analysis, and machine learning algorithms, Dachis Group has built a machine learning engine that is based on their pacesetting Social Business Design framework and leverages their experience in hundreds of social engagements and executions as the world's largest social business strategy organization.

8:15 - 9:00 : "Scalable Data Pipelines" - Josh Wills, Director of Data Science, Cloudera - @josh_wills

Most of the interesting applications of Hadoop, from building machine learning models to populating business intelligence dashboards, involve running a series of dependent MapReduce jobs. Over the past year, a number of libraries for JVM languages have emerged that make it easy to create pipelines that are testable, maintainable, and scalable. In this talk, we'll walk through the process of building a data pipeline in Crunch, a Java/Scala library for building pipelines that operate on complex data types, covering everything from the initial choice of data format through testing, debugging, and scaling.

Location

Bazaarvoice is located here (Map)

When you arrive at Bazaarvoice, you can park in any of the open spots outside the building or in the garage next to it. The meetup will be on the second floor: take the elevator up and follow the signs to the meeting room.

Wednesday, February 29, 2012

2012 - March Meeting