How do cloud sources work? Documentation
Sources are made up of one or both of the following components: a "sync" component and a "streaming" component. Together they populate logical collections of data, depending on which resources the upstream tool makes available and following data normalization best practices. These collections are either events (append-only data streams, akin to "facts" in data warehousing parlance) or objects (dimensional values that may be updated as state changes upstream).
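To make the difference between the two collection kinds concrete, here is a minimal sketch. The `Collections` class, its method names, and the ticket data are hypothetical and exist only for illustration; they are not part of our API.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class Collections:
    """Hypothetical in-memory stand-in for the two kinds of collections."""

    events: list[dict[str, Any]] = field(default_factory=list)
    objects: dict[tuple[str, str], dict[str, Any]] = field(default_factory=dict)

    def append_event(self, name: str, properties: dict[str, Any]) -> None:
        # Events are facts: each one is appended and never modified afterwards.
        self.events.append({"event": name, "properties": dict(properties)})

    def upsert_object(self, collection: str, object_id: str,
                      properties: dict[str, Any]) -> None:
        # Objects are keyed by (collection, id); newer state is merged over older state.
        key = (collection, object_id)
        current = self.objects.get(key, {})
        self.objects[key] = {**current, **properties}


store = Collections()

# Two state changes upstream produce two separate event rows...
store.append_event("Ticket Updated", {"ticket_id": "t1", "status": "open"})
store.append_event("Ticket Updated", {"ticket_id": "t1", "status": "solved"})

# ...but only one object row, which always reflects the latest known state.
store.upsert_object("tickets", "t1", {"status": "open", "subject": "Login issue"})
store.upsert_object("tickets", "t1", {"status": "solved"})
```

After this runs, `store.events` holds both updates, while `store.objects` holds a single ticket whose status is `solved` and whose subject is unchanged.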
Sync
When you enable a source and grant us access by pasting an API key or authenticating with OAuth, we begin running a scheduled job on your behalf. The job makes requests to the source tool's API, normalizes and transforms the data, and forwards it to our API. We try to use as few API calls as possible, fetching only data that has changed since the previous sync where we can. This can be an intensive process, especially on the first sync, so the job retries failed requests and respects the rate limits imposed by the partner.
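The sync job described above can be sketched roughly as follows. This is an illustration under stated assumptions, not our actual implementation: `RateLimited`, `normalize`, `sync_collection`, the integer `updated_at` cursor, and the fake partner data are all hypothetical.

```python
import time
from typing import Any, Callable


class RateLimited(Exception):
    """Hypothetical error raised by a partner client when a rate limit is hit."""

    def __init__(self, retry_after: float) -> None:
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


def normalize(record: dict[str, Any]) -> dict[str, Any]:
    # Illustrative normalization only: lowercase the partner's field names.
    return {key.lower(): value for key, value in record.items()}


def sync_collection(
    fetch_changed: Callable[[int], list[dict[str, Any]]],
    forward: Callable[[dict[str, Any]], None],
    cursor: int,
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run one sync and return the cursor to use for the next run."""
    for attempt in range(max_retries + 1):
        try:
            # Ask only for records that changed after the previous sync.
            records = fetch_changed(cursor)
            break
        except RateLimited as err:
            if attempt == max_retries:
                raise
            # Wait for as long as the partner asked before retrying.
            sleep(err.retry_after)
    for record in records:
        forward(normalize(record))
        cursor = max(cursor, record["updated_at"])
    return cursor


# A fake partner API: the first call is rate limited, later calls succeed.
RECORDS = [
    {"Id": "a", "Name": "Ada", "updated_at": 10},
    {"Id": "b", "Name": "Bob", "updated_at": 20},
]
calls = {"n": 0}


def fake_fetch(since: int) -> list[dict[str, Any]]:
    calls["n"] += 1
    if calls["n"] == 1:
        raise RateLimited(retry_after=0.0)
    return [r for r in RECORDS if r["updated_at"] > since]


forwarded: list[dict[str, Any]] = []
no_sleep = lambda seconds: None  # skip real waiting in this example
cursor = sync_collection(fake_fetch, forwarded.append, cursor=0, sleep=no_sleep)
cursor = sync_collection(fake_fetch, forwarded.append, cursor=cursor, sleep=no_sleep)
```

The first run retries once and forwards both records; the second run finds nothing newer than the saved cursor and forwards nothing, which is how an incremental sync keeps later runs cheap.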
API Call Usage and Collection Selection
We try to respect your API call allotments and limits. For example, with Salesforce we issue only one query per collection per run, which keeps usage to the minimum number of API calls possible (typically about 350 per day).
Moreover, we're deliberate about which collections we pull, striking a balance between allowing you to get a full picture of your users and reducing extraneous data (like administrative and metadata tables).
Soon, we'll allow you to specify which collections you care about during the source setup phase, so if you need to cut down on calls, you'll be able to just deselect collections.
Streaming
Streaming components listen in real time for webhooks sent by cloud sources, normalize and transform the data, and forward it to our APIs.
Both sync and streaming components can forward data to our event tracking and object upsert APIs, but in general sync components fetch objects and streaming components listen for events.
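As a rough illustration of what a streaming component does with one webhook, the sketch below turns a webhook body into an event. The payload shape (`type`, `data`, `created`), the output fields, and the `handle_webhook` name are assumptions made for this example; real partners each use their own formats.

```python
import json
from typing import Any


def handle_webhook(body: bytes) -> dict[str, Any]:
    """Normalize one hypothetical webhook payload into an event."""
    payload = json.loads(body)
    return {
        # Turn a dotted partner name such as "charge.succeeded" into a readable event name.
        "event": payload["type"].replace(".", " ").title(),
        "properties": payload.get("data", {}),
        "timestamp": payload.get("created"),
    }


raw = json.dumps(
    {"type": "charge.succeeded", "created": 1700000000, "data": {"amount": 1200}}
).encode()
event = handle_webhook(raw)
```

Here `event["event"]` is `"Charge Succeeded"` and the partner's `data` block is passed through as the event's properties, ready to be forwarded.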
If you have any questions, or see anywhere we can improve our documentation, please let us know!