Adobe Analytics Data feeds are a means to get raw data out of Adobe Analytics.
This project implements an Apache Spark data source built on the uniVocity TSV parser. It does not suffer from the flaw found in many online examples, which treat the (hit)data files as CSV. Concretely, a CSV parser does not handle escaped values correctly, because CSV and TSV represent special characters differently.
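To make the CSV versus TSV difference concrete, here is a minimal sketch in plain Scala. It is not the uniVocity implementation used by this project. It assumes the escaping convention as commonly described for Adobe hit data, where a backslash in front of a tab, newline or backslash marks that character as part of the value. The object and method names are hypothetical.

```scala
object TsvEscapeSketch {

  /** Splits one line on unescaped tabs.
    * Assumption: a backslash makes the following character literal.
    */
  def splitEscapedTsv(line: String): Vector[String] = {
    val fields  = Vector.newBuilder[String]
    val current = new StringBuilder
    var i = 0
    while (i < line.length) {
      val c = line.charAt(i)
      if (c == '\\' && i + 1 < line.length) {
        // Escaped character: keep it as data and skip the backslash.
        current.append(line.charAt(i + 1))
        i += 2
      } else if (c == '\t') {
        // Unescaped tab: end of the current field.
        fields += current.toString
        current.clear()
        i += 1
      } else {
        current.append(c)
        i += 1
      }
    }
    fields += current.toString
    fields.result()
  }
}
```

With the input `"a\\\tb\tc"` (an `a`, a backslash, a tab, a `b`, a tab, a `c`), a plain `split('\t')` yields three fields and shifts every later column, while `TsvEscapeSketch.splitEscapedTsv` yields two fields, the first of which contains the embedded tab.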
- Correct handling of records which contain special characters
- Lookup values are replaced with their actual value in the Lookup files
- Dynamic lookups are supported as well
- Events are parsed as an array of (key, value) pairs
- Products are parsed as an array of products, each with name, category, quantity, price, events and evars
- Capability to filter found manifest files through:
All available options are here: DatafeedOptions.scala
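As an illustration of the product parsing listed among the features, the sketch below splits a raw product string into structured records in plain Scala. It assumes Adobe's usual product syntax (`category;name;quantity;price;events;evars`, products separated by commas, events and evars separated by `|`). The `Product` case class and `ProductListSketch` object are hypothetical and do not reflect the schema or code of this data source.

```scala
object ProductListSketch {

  // Hypothetical record; field names and types are assumptions for illustration.
  final case class Product(
      category: Option[String],
      name: Option[String],
      quantity: Option[String],
      price: Option[String],
      events: Map[String, String],
      evars: Map[String, String])

  private def nonEmpty(s: String): Option[String] =
    Option(s).filter(_.nonEmpty)

  // Parses "k1=v1|k2=v2" into a map; a key without '=' gets an empty value.
  private def keyValues(s: String): Map[String, String] =
    s.split('|').iterator.filter(_.nonEmpty).map { kv =>
      val idx = kv.indexOf('=')
      if (idx < 0) kv -> "" else kv.substring(0, idx) -> kv.substring(idx + 1)
    }.toMap

  def parse(productList: String): Vector[Product] =
    productList.split(',').toVector.filter(_.nonEmpty).map { raw =>
      // Limit -1 keeps trailing empty fields; padTo fills omitted ones.
      val parts = raw.split(";", -1).padTo(6, "")
      Product(
        nonEmpty(parts(0)),
        nonEmpty(parts(1)),
        nonEmpty(parts(2)),
        nonEmpty(parts(3)),
        keyValues(parts(4)),
        keyValues(parts(5)))
    }
}
```

For example, `ProductListSketch.parse("Shoes;Runner;2;59.98;event1=4.5|event2=1;eVar3=red,;Socks;1;4.99")` returns two products, the second without a category, events or evars.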
Make sure the package is on the classpath, e.g. by using the --packages option:
spark-shell --packages "be.icteam:adobe-analytics-datafeed-datasource_2.12:$version"
You can then read the feed as follows:
val df = spark.read
.format("be.icteam.adobe.analytics.datafeed")
.load("./src/test/resources/randyzwitch")
Here is what it looks like:
df.show(3, false)
+------------------------------------------------------+----------------------------------+------------------+------------+---------------+------------------------+----------+------------------------+-----------+-----------+----------------+--------------------+-------------------+---------------+----------+-----+-----+-----+-----+-----+-----+-----+-----+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+-----------+------------------+-------------------------------------------------------------------+--------------------------+------------------+---------+-----------+-------+----------+--------+--------------+-----------------+-----------------+----------------------+---------+-------------------+------------------+-------------+------------+------------+-------------+----------------------+----------+----------+----------+----------+----------+----------+----------+----------+----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+---
--------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------+----------+----------+----------+----------+-------------+---------------+--------------------+--------------------+--------------------+---------------------------------------------------------------------+--------------------+--------------+---------------------------------------------------------------------+----------------------+--------------------------------------------------------------------+----------+----------+-------------+-----------------+-----------------------------------------------------------+------------+----------+----------+-----------+-----------+------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+---------------+----------------------------------------------------------------------+------------------+----------+-------------------+-------------------+---------+----------+-------------+-------+-------------------------------------------------------------------------------------------------------------------------+--------------+---------+--------------+-------------------------------------+-------------------+--------------------+-------------------------------------------------------------------+--------------------+
|post_event_list |post_product_list |browser |browser_type|connection_type|country |javascript|language |os |resolution |ref_type |accept_language |date_time |domain |evar1 |evar2|evar3|evar4|evar5|evar6|evar7|evar8|evar9|evar10|evar11|evar12|evar13|evar14|evar15|evar16|evar17|evar18|evar19|evar20|evar21|evar22|evar23|evar24|evar25|evar26|evar27|evar28|evar29|evar30|evar31|evar32|evar33|evar34|evar35|evar36|evar37|evar38|evar39|evar40|evar41|evar42|evar43|evar44|evar45|evar46|evar47|evar48|evar49|evar50|evar51|evar52|evar53|evar54|evar55|evar56|evar57|evar58|evar59|evar60|evar61|evar62|evar63|evar64|evar65|evar66|evar67|evar68|evar69|evar70|evar71|evar72|evar73|evar74|evar75|exclude_hit|first_hit_pagename|first_hit_page_url |first_hit_referrer |first_hit_time_gmt|geo_city |geo_country|geo_dma|geo_region|geo_zip |ip |last_hit_time_gmt|last_purchase_num|last_purchase_time_gmt|new_visit|post_browser_height|post_browser_width|post_campaign|post_channel|post_cookies|post_currency|post_cust_hit_time_gmt|post_evar1|post_evar2|post_evar3|post_evar4|post_evar5|post_evar6|post_evar7|post_evar8|post_evar9|post_evar10|post_evar11|post_evar12|post_evar13|post_evar14|post_evar15|post_evar16|post_evar17|post_evar18|post_evar19|post_evar20|post_evar21|post_evar22|post_evar23|post_evar24|post_evar25|post_evar26|post_evar27|post_evar28|post_evar29|post_evar30|post_evar31|post_evar32|post_evar33|post_evar34|post_evar35|post_evar36|post_evar37|post_evar38|post_evar39|post_evar40|post_evar41|post_evar42|post_evar43|post_evar44|post_evar45|post_evar46|post_evar47|post_evar48|post_evar49|post_evar50|post_evar51|post_evar52|post_evar53|post_evar54|post_evar55|post_evar56|post_evar57|post_evar58|post_evar59|post_evar60|post_evar61|post_evar62|post_evar63|post_evar64|post_evar65|post_evar66|post_evar67|post_evar68|post_evar69|post_evar70|post_evar71|post_evar72|post_evar73|post_evar74|post_evar75|post_hier1|post_hier2|post_hier3|post_hier4|post_hier5|post_keywords|post_page_event|post_page_e
vent_var1|post_page_event_var2|post_page_event_var3|post_pagename |post_pagename_no_url|post_page_type|post_page_url |post_persistent_cookie|post_prop1 |post_prop2|post_prop3|post_prop4 |post_prop5 |post_prop6 |post_prop7 |post_prop8|post_prop9|post_prop10|post_prop11|post_prop12 |post_prop13|post_prop14|post_prop15|post_prop16|post_prop17|post_prop18|post_prop19|post_prop20|post_prop21|post_prop22|post_prop23|post_prop24|post_prop25|post_prop26|post_prop27|post_prop28|post_prop29|post_prop30|post_prop31|post_prop32|post_prop33|post_prop34|post_prop35|post_prop36|post_prop37|post_prop38|post_prop39|post_prop40|post_prop41|post_prop42|post_prop43|post_prop44|post_prop45|post_prop46|post_prop47|post_prop48|post_prop49|post_prop50|post_prop51|post_prop52|post_prop53|post_prop54|post_prop55|post_prop56|post_prop57|post_prop58|post_prop59|post_prop60|post_prop61|post_prop62|post_prop63|post_prop64|post_prop65|post_prop66|post_prop67|post_prop68|post_prop69|post_prop70|post_prop71|post_prop72|post_prop73|post_prop74|post_prop75|post_purchaseid|post_referrer |post_search_engine|post_state|post_visid_high |post_visid_low |post_zip |prev_page |ref_domain |service|user_agent |visit_keywords|visit_num|visit_page_num|visit_referrer |visit_search_engine|visit_start_pagename|visit_start_page_url |visit_start_time_gmt|
+------------------------------------------------------+----------------------------------+------------------+------------+---------------+------------------------+----------+------------------------+-----------+-----------+----------------+--------------------+-------------------+---------------+----------+-----+-----+-----+-----+-----+-----+-----+-----+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+-----------+------------------+-------------------------------------------------------------------+--------------------------+------------------+---------+-----------+-------+----------+--------+--------------+-----------------+-----------------+----------------------+---------+-------------------+------------------+-------------+------------+------------+-------------+----------------------+----------+----------+----------+----------+----------+----------+----------+----------+----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+---
--------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------+----------+----------+----------+----------+-------------+---------------+--------------------+--------------------+--------------------+---------------------------------------------------------------------+--------------------+--------------+---------------------------------------------------------------------+----------------------+--------------------------------------------------------------------+----------+----------+-------------+-----------------+-----------------------------------------------------------+------------+----------+----------+-----------+-----------+------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+---------------+----------------------------------------------------------------------+------------------+----------+-------------------+-------------------+---------+----------+-------------+-------+-------------------------------------------------------------------------------------------------------------------------+--------------+---------+--------------+-------------------------------------+-------------------+--------------------+-------------------------------------------------------------------+--------------------+
|[{Instance of eVar1, null}, {Instance of eVar2, null}]|[{null, , null, null, null, null}]|Safari 7.1 |Apple |LAN/Wifi |Commercial (mostly U.S.)|1.6 |English (United States) |OS X 10.9.5|1400 x 864 |Search Engines |en-us |2015-07-13 00:26:18|netvigator.com |logged-out|guest|null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |0 |null |http://randyzwitch.com/broken-macbook-pro-hinge-fixed-free/ |https://www.google.com.hk/|1436761578 |hong kong|hkg |0 |no region |0 |219.77.75.182 |0 |0 |0 |1 |687 |1347 |null |null |Y |USD |1436761578 |logged-out|guest |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |::empty:: |0 |null |null |null |http://randyzwitch.com/broken-macbook-pro-hinge-fixed-free |null |null |http://randyzwitch.com/broken-macbook-pro-hinge-fixed-free |Y |Broken MacBook Pro Hinge? Apple will fix for free! 
| randyzwitch.com|1173 |post |single-post |technology |apple,customer-service,genius-bar,macbook-pro |Randy Zwitch|1 |2012 |06 |25 |June 25, 2012 |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |https://www.google.com.hk/ |557 |null |2791471528899189638|791228704714081521 |::hash::0|0 |google.com.hk|ss |Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/600.1.17 (KHTML, like Gecko) Version/7.1 Safari/537.85.10 |::empty:: |1 |1 |https://www.google.com.hk/ |557 |null |http://randyzwitch.com/broken-macbook-pro-hinge-fixed-free/ |1436761578 |
|[{Instance of eVar1, null}, {Instance of eVar2, null}]|[{null, , null, null, null, null}]|Google Chrome 43.0|Google |LAN/Wifi |Japan |1.6 |English (United States) |Windows 8.1|1280 x 800 |Search Engines |en-US,en;q=0.8 |2015-07-13 00:56:09|aist.go.jp |logged-out|guest|null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |0 |null |http://randyzwitch.com/rsitecatalyst-website-pathing-sankey-charts/|https://www.google.com/ |1436426719 |tsukuba |jpn |0 |08 |305-0005|150.29.149.177|1436754129 |0 |0 |1 |777 |1293 |null |null |Y |USD |1436763369 |logged-out|guest |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |::empty:: |0 |null |null |null |http://randyzwitch.com/rsitecatalyst-website-pathing-sankey-charts |null |null |http://randyzwitch.com/rsitecatalyst-website-pathing-sankey-charts |Y |Visualizing Website Pathing With Sankey Charts |3047 |post |single-post |digital-analytics|adobe-analytics,data-visualization,omniture,r,rsitecatalyst|Randy Zwitch|1 |2014 |09 |10 |September 10, 2014|7 |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null 
|null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |https://www.google.com/ |57 |null |3037297388874966800|6917530475045353754|::hash::0|0 |google.com |ss |Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.132 Safari/537.36 |::empty:: |4 |1 |https://www.google.com/ |57 |null |http://randyzwitch.com/rsitecatalyst-website-pathing-sankey-charts/|1436763369 |
|[{Instance of eVar1, null}, {Instance of eVar2, null}]|[{null, , null, null, null, null}]|Google Chrome 43.0|Google |LAN/Wifi |Network (mostly U.S.) |1.6 |English (United States) |OS X 10.10 |1280 x 800 |Search Engines |en-US,en;q=0.8 |2015-07-13 00:48:36|comcast.net |logged-out|guest|null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |0 |null |http://randyzwitch.com/hive-five-hard-won-lessons/ |https://www.google.com/ |1435962984 |san jose |usa |807 |ca |95126 |50.136.222.167|1436200856 |0 |0 |1 |777 |1197 |null |null |Y |USD |1436762916 |logged-out|guest |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |::empty:: |0 |null |null |null |http://randyzwitch.com/hive-five-hard-won-lessons |null |null |http://randyzwitch.com/hive-five-hard-won-lessons |Y |Five Hard-Won Lessons Using Hive | randyzwitch.com |2680 |post |single-post |data-science |big-data,hadoop,hive,python,r |Randy Zwitch|1 |2014 |06 |12 |June 12, 2014 |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null 
|null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |null |https://www.google.com/ |57 |null |3083707027358817578|6917535643501355093|::hash::0|0 |google.com |ss |Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.132 Safari/537.36|::empty:: |4 |1 |https://www.google.com/ |57 |null |http://randyzwitch.com/hive-five-hard-won-lessons/ |1436762916 |
val df = spark.read
.format("be.icteam.adobe.analytics.datafeed")
.option(ClickstreamOptions.MODIFIED_AFTER, checkpoint)
.load("s3://bucket/landing/feed")
df.write.format("delta").save("s3://bucket/conformed/feed")
Publish your own version in your local m2 repository:
sbt publishM2
This project leverages sbt-ci-release to create and publish to Sonatype and Maven Central from GitHub Actions.
Create and push the appropriate tag (vX.Y.Z), and ci.yml will make sure a release is built:
git tag v0.1.0
git push --tags