How do I import historical data?
When transitioning to Segment, customers commonly want to import historical data into the tools they are migrating to or evaluating.
Note: Historical imports can only be done into destinations that accept historically timestamped data. Most analytics tools, such as Mixpanel, Amplitude, and Kissmetrics, handle that type of data just fine. One common destination that does not is Google Analytics, because its API cannot accept historical data.
Use any server-side library; these libraries send requests in batches to improve performance. Once you have data to import, follow the steps below:
Export or collect the data to be imported.
Include timestamp data in your export if the data needs to appear in end tools with its historical dates. For instance, if you are importing email subscribers and it matters when each one joined your email list, you may need to export the timestamp. If no timestamp is specified when importing, the data will show the time at which it was received.
Decide which destinations need to receive the data.
By default, data coming into Segment is relayed to all destinations linked to a given source. To limit data to specific destinations, you must modify the integrations object. With historical data, you often only want to send the data to a specific destination or into your data warehouse. For example, set the integrations object as follows:
analytics.track({
  event: 'Upgraded Membership',
  userId: '97234974',
  // Disable all destinations, then enable only the ones that should receive the data.
  integrations: { 'All': false, 'Vero': true, 'Google Analytics': false }
});
Once you’ve done that, you’ll need to write an application or worker to send the data to Segment.
You will need to cycle through each set of data and either map it to a Segment server-side library method or build an array matching the HTTP Import API format. We recommend using a Segment library for this process, because it sets contextual message fields such as message_id (used for deduping) and sent_at (used for correcting client clock skew) that our API uses to ensure correct behavior upon ingestion. The server-side libraries automatically batch requests to optimize performance and keep request volume from growing linearly with the number of messages. This batching behavior is configurable. Some of the libraries implement a configurable max queue size and may discard messages if you enqueue requests much faster than the client can flush them. We recommend overriding the library's max queue size parameter with a value high enough that your batch job will stay under it.
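The steps above can be sketched roughly as follows. This is a minimal illustration, not Segment's importer: the row shape (user_id, event_name, occurred_at) and the helper names toTrackPayload and importRows are assumptions made for the example, and the client is a stand-in object that only records calls. In a real job you would pass a Segment server-side library instance that exposes a track method instead.

```javascript
// Hypothetical helper: convert one exported row into a track payload.
// The row fields (user_id, event_name, occurred_at) are assumed for illustration.
function toTrackPayload(row, destinations) {
  // Start with every destination disabled, then enable the chosen ones.
  const integrations = { All: false };
  for (const name of destinations) {
    integrations[name] = true;
  }
  const payload = {
    userId: String(row.user_id),
    event: row.event_name,
    integrations,
  };
  // Only set a timestamp when the export has one; without it the event
  // shows the time at which it was received.
  if (row.occurred_at) {
    payload.timestamp = new Date(row.occurred_at);
  }
  return payload;
}

// Hypothetical worker loop: send every row through any client that
// exposes track(payload). Returns the number of calls made.
function importRows(rows, client, destinations) {
  let sent = 0;
  for (const row of rows) {
    client.track(toTrackPayload(row, destinations));
    sent += 1;
  }
  return sent;
}

// Stand-in client that records payloads so the sketch runs without a network.
const recorded = [];
const fakeClient = { track: (payload) => recorded.push(payload) };

// Sample export: the second row has no timestamp on purpose.
const rows = [
  { user_id: 97234974, event_name: 'Upgraded Membership', occurred_at: '2015-03-01T12:00:00Z' },
  { user_id: 97234975, event_name: 'Upgraded Membership' },
];

const count = importRows(rows, fakeClient, ['Vero']);
console.log(count, JSON.stringify(recorded[0].integrations));
```

Keeping the row-to-payload mapping in its own function makes it easy to check the timestamp and integrations handling on a small sample before running the full import.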
One of our Success Engineers wrote an alpha prototype Node.js app for importing data using the HTTP API.
If a server-side library doesn’t meet your needs, use the Segment bulk import HTTP API. If you’re using the HTTP API directly to replay data you’ve exported from Segment, we recommend removing the original sent_at, message_id, and project_id fields from the archived messages before forwarding them to Segment.
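As a rough sketch of that cleanup step, the snippet below removes the three fields from a copy of each archived message and groups the results into chunks ready to be sent. The helper names stripReplayFields and toChunks, the sample message, and the chunk size are assumptions for illustration; the exact request format and size limits should be taken from the HTTP API documentation.

```javascript
// Fields the text recommends removing before replaying archived messages.
const FIELDS_TO_STRIP = ['sent_at', 'message_id', 'project_id'];

// Hypothetical helper: return a cleaned shallow copy, leaving the archive untouched.
function stripReplayFields(message) {
  const copy = { ...message };
  for (const field of FIELDS_TO_STRIP) {
    delete copy[field];
  }
  return copy;
}

// Hypothetical helper: split messages into arrays of at most `size` items.
// The size used below is an arbitrary example value, not a documented limit.
function toChunks(messages, size) {
  const chunks = [];
  for (let i = 0; i < messages.length; i += size) {
    chunks.push(messages.slice(i, i + size));
  }
  return chunks;
}

// Sample archived message; its shape is assumed for illustration.
const archived = {
  type: 'track',
  event: 'Upgraded Membership',
  user_id: '97234974',
  timestamp: '2015-03-01T12:00:00Z',
  sent_at: '2015-03-01T12:00:01Z',
  message_id: 'abc-123',
  project_id: 'proj-1',
};

const cleaned = stripReplayFields(archived);
const chunks = toChunks([cleaned, cleaned, cleaned], 2);
console.log(Object.keys(cleaned).join(','), chunks.length);
```

Working on a copy keeps the archived export unchanged, so the job can be rerun from the same files if a request fails.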
Our friends at MarketLytics have written up their experience using the alpha prototype importer and offer some helpful visuals and tips.
If you have any questions, or see anywhere we can improve our documentation, please let us know!