Hi Ryan, I was curious if you can add me to GCP as well so I can view some of the things we reviewed today in BigQuery?
Done, let me know if you can get in here: https://console.cloud.google.com/bigquery?orgonly=true&project=tort-intake-professionals&supportedpurview=project&ws=!1m0
Ok great, yep I can access and query the tables.
Ah, I also just realized I will need access to the github in order to check out a branch.
My github username is chrisg-bytecodeio if you can add me to the repo as a developer.
https://github.com/SL-BI/tip_project/invitations
Good afternoon Ryan, and happy Friday. Would you be available sometime Monday afternoon to review the updates I made for setting up the snapshot as well as some dbt best practices?
We could set it for 30 minutes or an hour but an hour would allow us some time to discuss everything in detail. And I can record the meeting for you to review later as well.
Let's do Tuesday, my Monday is slammed
https://calendar.app.google/HKTuVhLVGZdk7d2W7
I scheduled it for 1, feel free to revise that to 1 hour if you like or else I can condense everything into 30 minutes.
Hi Ryan, I think to fix this we will need to make staging the default branch in GitHub and clean it all up. Can you add me as another admin on the GitHub repo temporarily?
I can change that real quick
Ok I have it all set up if you'd like to hop back on for a minute
Hey Ryan sorry I lost your audio there at the end, but I will follow up with you tomorrow.
@deleted-U0410U6Q8J3
how do I exclude rows where _fivetran_deleted is TRUE in this DBT model?
{{ dbt_utils.star(from = source("aws_legagcy_public", "financial_log")) }}
from {{ source('aws_legagcy_public', 'financial_log') }}
You would just add a where clause after from, same as a normal sql query
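For example, a minimal sketch of that model with the filter added. It assumes `_fivetran_deleted` is a BOOLEAN column on the source table, and that the model opens with a `select`, which the snippet above does not show:

```sql
-- Sketch only: same source and dbt_utils.star call as in the question,
-- with a filter that drops rows Fivetran has flagged as deleted.
select
    {{ dbt_utils.star(from=source('aws_legagcy_public', 'financial_log')) }}
from {{ source('aws_legagcy_public', 'financial_log') }}
-- coalesce treats a NULL flag as "not deleted" so those rows are kept
where not coalesce(_fivetran_deleted, false)
```

The `coalesce` is a design choice: a bare `where _fivetran_deleted = false` would also drop rows where the flag is NULL.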
@deleted-U0410U6Q8J3, how you coming? Can we meet Wednesday to review this first SL project step for DBT tables?
Hi @Ryan, thanks for reaching out. Yes I should have this wrapped up end of day today, and Wednesday sounds good to meet. When would be a good time for you?
Does 2PM CT work? Kinda a crazy day but I can get an hour with you.
Looks like I am free anytime after 1:00 pm CST except 2:30-3:00, and I'll be online til 6pm CST.
Morning, please add @Brian Hirst to the call today. Ty.
Hey Ryan, I should have everything done that we discussed by my end of day today, would you like to have that follow up meeting tomorrow where we try to switch it over in Looker?
Let's do 11AM PT on Tuesday next week, I'm just swamped with some financials and our AI product demo tomorrow.
Sure, sounds good. I just sent the invite to both you and Brian.
Hey Chris, I have some orchestration issues on our jobs. We need to clean up. I'm traveling tomorrow, can we meet Friday?
Hey Ryan, sure. I can be free anytime from 8 am - 11 am CST, or else 12 pm - 2 pm CST. Just let me know what works for your schedule.
@deleted-U0410U6Q8J3 , need to push to Monday. What times?
Hi Ryan, sure I am free anytime before 11 am CST, or else anytime after 3pm CST.
Good morning Ryan, I realized today's meeting is in conflict with another meeting. Are you free any other times today? (I'm only unavailable from 12-1 pm CST.)
Hi Ryan, hope you are doing well. I wanted to follow up to let you know I have updated the generate_schema_name macro to use a qa_ prefix for the STG builds for both the SL and TIP projects.
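For reference, a rough sketch of what a `generate_schema_name` override with a `qa_` prefix can look like. This is not the exact macro in the SL or TIP repos; the target name `stg` and the choice to use the custom schema on its own (rather than dbt's default `<target_schema>_<custom_schema>` pattern) are assumptions for illustration:

```sql
{% macro generate_schema_name(custom_schema_name, node) -%}
    {#- Base name: the custom schema if a model sets one, else the target schema. -#}
    {%- if custom_schema_name is none -%}
        {%- set base_schema = target.schema -%}
    {%- else -%}
        {%- set base_schema = custom_schema_name | trim -%}
    {%- endif -%}
    {#- Assumption: the STG environment runs with a target named 'stg'. -#}
    {%- if target.name == 'stg' -%}
        qa_{{ base_schema }}
    {%- else -%}
        {{ base_schema }}
    {%- endif -%}
{%- endmacro %}
```

With something like this in place, a model configured with `schema='marts'` would build into `qa_marts` on the STG target and into `marts` everywhere else.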
I think regarding your desire to have a multi-step merge process for production code, something like
dev branch > team_dev > staging > prod or dev branch > staging > team_dev > prod
I think you can simplify it into just having the process be: dev branch > staging > production, and you would do so via restricting who can merge staging to production to maybe just yourself or whoever owns the development lifecycle. This has worked well for our other clients.
We can accomplish this with an updated configuration in Github, where we restrict who can merge to the subsequent branches and the production branch via a codeowners configuration.
I'd be happy to meet with you sometime this week if you'd like me to explain this some more in detail and we can walk through the updates needed to configure this change. For now, I would go ahead and create a codeowners team (you can just add yourself for now if you like) within your Github organization, and then we can follow up for next steps and testing this functionality.
Got it, @deleted-U0410U6Q8J3, let's do your recommended dev branch > staging > production. Can you add my rvaspraml GitHub user to a "Code Owners" team and get GitHub ready for tomorrow? We're doing this for the SL BI and TIP BI projects, so when are you free tomorrow or Wednesday to review? Preferably tomorrow, as I have a SL client data store I am working on and don't want to overlap your clean up and mess it up.
Sure that sounds good, I can be free Tuesday or Wednesday, whenever works best for you if you'd like to send the invite.
I will follow up on setting this up today, and I'll let you know if I need anything further in the interim.
Thank you. Will send invite soon.
I realized I have access to the relevant repos, but I am not an organization admin for your Github, so you will have to set up the team in Github.
If you can do so ahead of the meeting, create a team named code-owners, and add yourself only to it. (If you have any issues with this we can take care of it on the call tomorrow.)
Then on tomorrow's call we will configure the repos to only allow you to merge to the production branch.
Hi Ryan I am really sorry I mixed up the invitations and thought this was rescheduled for 10/11; saw your subsequent invite for 10/14.
All good. Enjoy your OOO
Hey Ryan, saw your follow up invite... I am booked unfortunately at 1-1:30pm CT, but I am free anytime after 2:30 pm CT, or also 10:00-11:00 am CT. Would one of those times be good?
@deleted-U0410U6Q8J3, you free for 5 minutes? I'm stuck in DBT trying to create a simple new model
Hey Ryan, sure I am free for the next 20 minutes. Happy to hop on.
Hey Ryan, it looks like the ryandev branch contains the newer version of everything along with the older version of everything. Would it be ok if I reset your branch to the state of the staging branch and we start fresh and implement the new lrdata model?
Not sure how it's got both old and new, so it's probably best to just get it back to what's current first.
The reason that model was erroring is because the lr_inbox data was ingested to lr_data which is in us-west4; if you ingest it to the US multiregion it will work the same as the others I believe.
If you want to go ahead and update that, we can then update the dbt development branches to be cleaned up first, then test adding that additional model.
Yes that is the only dataset in us-west4, we should align it with the others in US region
How was it created the first time? Via an ETL tool or manually created?
Yeah I don't think so, you can override the location in a yaml configuration but not at the dataset level, it's at the whole project level.
We'd need to have him update the ingestion to a US location dataset
I could copy that data temporarily into another dataset but it's not a permanent fix
@deleted-U0410U6Q8J3, now, what about my DBT login?
Yes let me go ahead and reset the branch now, just a moment and I'll follow up.
Great, will advise on the data location issue. Also, if I create a "new table" from BigQuery for the iolrstatusesservices table from that lrinbox table, and ensure it's in US location, that might work?
@deleted-U0410U6Q8J3, how are the PK Snapshot fixes coming? I am getting snapshot error on Build Jobs in DBT in TIP project.
Good morning @Ryan, yes I went ahead and moved that update into your ryan_dev branch, I haven't committed it in case you would like to look at it. Sorry for the delay, it is my birthday today so I just was checking Slack and email. If the current state looks good to go you probably can go ahead and commit and merge it.
Essentially what we can do is add a post_hook SQL DDL statement into the model config for the snapshot itself. So every time the snapshot is taken and the output table is built, it adds the id field as the primary key via an ALTER TABLE statement.
BigQuery doesn't enforce primary keys of course like an RDBMS, but it does have beneficial performance implications on joins for having primary keys declared. We can add some evaluation for ensuring primary keys exist on tables where they are supposed to be declared in BigQuery if you think that also is helpful.
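A minimal sketch of that post-hook approach, not the exact code in the ryan_dev branch. The `target_schema`, `strategy`, and `check_cols` values here are placeholders, and the source model is assumed to be `dbt_lr_case_types_revenue_rates`:

```sql
{% snapshot dbt_lr_case_types_revenue_rates_snapshot %}

{#-
  Sketch only. The snapshot table persists between runs, so the hooks
  drop any existing key first and then declare it again; a second
  ADD PRIMARY KEY on a table that already has one would fail.
-#}
{{
    config(
        target_schema='snapshots',
        unique_key='id',
        strategy='check',
        check_cols='all',
        post_hook=[
            "alter table {{ this }} drop primary key if exists",
            "alter table {{ this }} add primary key (id) not enforced"
        ]
    )
}}

select * from {{ ref('dbt_lr_case_types_revenue_rates') }}

{% endsnapshot %}
```

One caveat to weigh: a snapshot table keeps one row per version of each record, so `id` can repeat there. Since BigQuery trusts unenforced keys when optimizing joins, declaring the key on `dbt_scd_id` (which is unique per snapshot row) may be the safer option.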
Happy birthday, can we set up a time tomorrow or Friday to review it so I clearly understand?
Sure, yeah that sounds good. Whatever time works for you.
@deleted-U0410U6Q8J3,
Good morning. DBT for the TIP project is failing on occasion, can you investigate? https://xa302.us1.dbt.com/deploy/251196/projects/357639/runs/70471824444898
17:07:17 Completed with 1 error and 0 warnings:
17:07:17
17:07:17
17:07:17 Database Error in snapshot dbt_lr_case_types_revenue_rates_snapshot (snapshots/dbt_lr_case_types_revenue_rates_snapshot.sql)
UPDATE/MERGE must match at most one source row for each target row
compiled Code at target/run/tip_dbt/snapshots/dbt_lr_case_types_revenue_rates_snapshot.sql
17:07:17
17:07:17
*Thread Reply:* Good morning Ryan, following up on this, it looks like the snapshot (dbt_lr_case_types_revenue_rates_snapshot.sql) is failing because the columns in the underlying table (dbt_lr_case_types_revenue_rates.sql) have changed (it is missing some fields that used to be in it), so there's an incompatibility with what is already in the snapshot. When the snapshot model runs, it compares the existing snapshot to the table's new data but finds a differently shaped table, so it throws the error.
Resolution for this would be to revert the model to the old structure, or to wipe the snapshot and start it again. But in the future to maintain integrity we will want to keep that model as it is.
*Thread Reply:* @deleted-U0410U6Q8J3, we added some columns to the table for application purposes, see below. So I need to update the Raw Data models and snapshot and start again by deleting the snapshot from BigQuery?
*Thread Reply:* Yes that's right, and in the future to make it a bit more flexible we could add a model in between it and the snapshot, which only passes through the needed fields, so maintenance of the base table won't break the ongoing snapshot.
*Thread Reply:* We could also do that now instead of deleting the existing snapshot, unless you want the new fields in the snapshot.
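As a sketch of that in-between model: the file name and every column other than `id` are hypothetical, since the real field list isn't shown in this thread.

```sql
-- models/dbt_lr_case_types_revenue_rates_snapshot_input.sql (hypothetical name)
-- Passes through only the fields the snapshot should track, so columns
-- added to the base table later do not change the snapshot's shape.
select
    id,
    case_type,      -- hypothetical column
    revenue_rate    -- hypothetical column
from {{ ref('dbt_lr_case_types_revenue_rates') }}
```

The snapshot would then select from `{{ ref('dbt_lr_case_types_revenue_rates_snapshot_input') }}` instead of the base model, and new application columns only reach the snapshot when they are added to this list on purpose.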
*Thread Reply:* @deleted-U0410U6Q8J3, think I fixed it but shouldn't the "staging" environment have its own version of the snapshot? I noticed we are "fixing the BigQuery dataset" in the snapshot sql model in DBT
It did it, then stopped and worked. So something in orchestration is my guess
Meaning the snapshot is running into some scenario we haven't handled.
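One scenario that may be worth ruling out: BigQuery raises "UPDATE/MERGE must match at most one source row for each target row" when a MERGE finds more than one source row for the same target row, which for a snapshot can happen if the unique key repeats in the source at the moment the job runs. A quick check, assuming the snapshot's `unique_key` is `id` and it reads from `dbt_lr_case_types_revenue_rates`:

```sql
-- Sketch only: lists ids that appear more than once in the snapshot's source.
-- Any rows returned would be enough to trigger the MERGE error above.
select
    id,
    count(*) as row_count
from {{ ref('dbt_lr_case_types_revenue_rates') }}
group by id
having count(*) > 1
```

If this comes back empty on a run that succeeds but returns rows around a failing run, that would point to duplicates arriving intermittently upstream rather than to the snapshot config itself.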