A library for reading social data from Facebook using Spark Streaming.
Run a demo via:
# set up all the requisite environment variables
# you can create a new app id and secret here: https://developers.facebook.com/quickstarts/
# you can generate a new auth token here: https://developers.facebook.com/tools/accesstoken/
export FACEBOOK_APP_ID="..."
export FACEBOOK_APP_SECRET="..."
export FACEBOOK_AUTH_TOKEN="..."
# compile scala, run tests, build fat jar
sbt assembly
# run locally
java -cp target/scala-2.11/streaming-facebook-assembly-0.0.3.jar FacebookDemo standalone
# run on spark
spark-submit --class FacebookDemo --master "local[2]" target/scala-2.11/streaming-facebook-assembly-0.0.3.jar spark
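The demo depends on the three environment variables exported above, so it can help to check them before launching. The following is a minimal sketch of such a check and is not code from this library: the `Credentials` object and its `load` method are hypothetical names used only for illustration.

```scala
// Hypothetical helper, not part of the library: reads the three credentials
// from the environment and reports every missing variable at once.
object Credentials {
  val required: Seq[String] =
    Seq("FACEBOOK_APP_ID", "FACEBOOK_APP_SECRET", "FACEBOOK_AUTH_TOKEN")

  // Returns Left(missing variable names) or Right(name -> value).
  // The env parameter defaults to the process environment and can be
  // replaced with a plain Map in tests.
  def load(env: Map[String, String] = sys.env): Either[Seq[String], Map[String, String]] = {
    val missing = required.filterNot(name => env.get(name).exists(_.nonEmpty))
    if (missing.nonEmpty) Left(missing)
    else Right(required.map(name => name -> env(name)).toMap)
  }
}
```

Calling `Credentials.load()` at startup and printing the `Left` side gives a single message listing every unset variable, rather than one failure per run.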
Facebook doesn't expose a firehose API, so we resort to polling. The FacebookReceiver polls the Facebook API every few seconds and pushes any new posts into Spark Streaming for further processing.
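The polling approach can be sketched in plain Scala. This is an illustration under stated assumptions, not the FacebookReceiver implementation: `Post`, `PollingState`, `fetch`, and `push` are hypothetical names. Here `fetch` stands in for a call to the Facebook API, and `push` stands in for handing an item to Spark Streaming (in a custom Spark receiver that would typically be `Receiver.store`).

```scala
import java.time.Instant
import scala.collection.mutable.ArrayBuffer

// Hypothetical post shape; the library's real model may differ.
final case class Post(id: String, createdAt: Instant, message: String)

// Remembers the newest timestamp seen so far, so that each poll
// forwards only posts that were not forwarded before.
final class PollingState(initial: Instant = Instant.EPOCH) {
  private var lastSeen: Instant = initial

  // Runs one polling round and returns the number of new posts pushed.
  def pollOnce(fetch: () => Seq[Post], push: Post => Unit): Int = {
    val fresh = fetch()
      .filter(_.createdAt.isAfter(lastSeen))
      .sortBy(_.createdAt.toEpochMilli)
    fresh.foreach(push)
    fresh.lastOption.foreach(p => lastSeen = p.createdAt)
    fresh.size
  }
}

object PollingDemo extends App {
  val state = new PollingState()
  val t0 = Instant.parse("2016-01-01T00:00:00Z")
  val firstResponse = Seq(Post("1", t0, "hello"))
  val secondResponse = firstResponse :+ Post("2", t0.plusSeconds(5), "world")

  val received = ArrayBuffer.empty[Post]
  state.pollOnce(() => firstResponse, p => received += p)
  state.pollOnce(() => secondResponse, p => received += p) // post "1" is skipped
  println(received.map(_.id).mkString(",")) // prints 1,2
}
```

A real receiver would call `pollOnce` in a loop on a background thread, sleeping a few seconds between rounds. Note that this sketch filters strictly by timestamp, so a post sharing the exact timestamp of the last seen post would be dropped; tracking seen ids would avoid that.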
Currently, the following ways to read Facebook items are supported:
- by page (sample data)
- comments for page (sample data)

To publish a release:
- Configure your credentials via the SONATYPE_USER and SONATYPE_PASSWORD environment variables.
- Update version.sbt
- Enter the SBT shell: sbt
- Run sonatypeOpen "enter staging description here"
- Run publishSigned
- Run sonatypeRelease
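
For the version.sbt step, the file usually holds a single setting. The fragment below is an assumption about its shape, not a copy of this repository's file: the version number is a placeholder, and the syntax shown is the sbt 0.13 style (newer sbt versions write `ThisBuild / version` instead).

```scala
// version.sbt (hypothetical contents; replace the placeholder with the version being released)
version in ThisBuild := "0.0.4"
```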