Fact table retaining duplicate key rows with different time.day field
I have a fact table that's set to refresh based on a delete key. However, I noticed that the table is retaining multiple copies of that key as each processing occurs, where the only difference in the record is that the time.day field increments by one. Any idea why this is happening and how to resolve?
Answers
-
Hi Dan,
I just went through many Spaces of mine and I'm noticing that in "Space Management → View Published Data, Measure Data" ALL Dimensions are reflected, each in couple with Time.Day, even though many of them aren't in the grain of any Fact Table! I mean, I have many ADR Prepared Sources which are only used to populate Dimensions, with no Column marked as a Measure. Under the published Measure Data, there is even one line with "Time.Day" alone.
For all of these "phantom" facts, if I try "Preview Data" I get nothing (just "Select" as a Data query, nothing else).
Is this behavior expected? Is this some recent bug? (I cannot say whether this has always been the behavior, never put attention to this.) @Diane Mcelhiney Sorry for bothering, but do you have any clue here? Should I open a Case?
Regarding the "Time.Day" field in your post, I believe you are referring to "Time.Day ID". If I'm right, that's the ID of the Load Date. Apparently you're experiencing a scenario where the system is keeping each Snapshot. If you have set the Snapshot and Transaction Policy for your published Source as "Incremental refresh", this should not happen, to my understanding.
One thing that I do when I am in doubt is to check the Repository.xml. This is not possible directly (only Infor Support has access to Repository Management, unfortunately), but if you create an empty Space and you go to Compare and Merge and compare your Space to the empty one, you can figure out the structure of Repository.xml. I have just done so and I have seen that all Sources (they are actually called "Staging Tables" here) have a Property called "Transactional" (it's towards the end, after the definitions of the Columns), which gets set to "true" if the Snapshot and Transaction Policy for the Source (in Modeler Relate) is "Incremental refresh". Also, there is a property "SnapshotDeleteKeys" which is populated as an array with the selected Delete Keys in Relate.
By the way, one thing that happened to me multiple times in the past is that key Columns (i.e., Columns which are keys for a Dimension in the Grain of the Source) sometimes do not get marked as "NaturalKey" in the Repository.xml (i.e., there is a Property "NaturalKey" which normally goes to true, but sometimes is instead false), and it has always been the case that when this happens then the relative Surrogate Key in the corresponding Fact Table doesn't get populated (which creates a mess). Seems that de-selecting the Column from the Grain, saving the Source, and reverting back fixes the problem. One of these days I'm going to
cut Birst into little piecesopen a Case about this.Best,
Enrico
0 -
Are the multiple copies of the key being seen within the same load_id? When an incremental refresh with delete key is the transaction policy, rows matching the delete key not equal to the current load_id are deleted. The source extraction needs to be verified whether multiple keys used as the delete key exists in the load result set.
0 -
Yes, if you publish the same record multiple times (or, multiple records with the same delete key) in the same load, you get multiple records in the Fact Table. Right?
0 -
Yes, 'duplicates' within the same extract within the same load id. Delete Key Incremental Refresh transacts against rows loaded prior to the current publishing load id.
0 -
Birst Time Hierarchy is not exposed in Publish Data
Regarding the measure tables in the View Publish Data tab that have Time hierarchy and another hierarchy in the name are internal tables. It is not a bug as there is nothing wrong with respect to the underlying codebase. Being observant, should these tables even be shown in the tab? Possibly not and it is just UX aesthetics.
0 -
@Diane Mcelhiney @Enrico Menotti for further reference:
0 -
@Diane Mcelhiney @Enrico Menotti and there is only one row coming in from the source table:
0 -
We may have to talk about it off-line because there needs to be a deeper dive walk-thru with the data. I'd like to know if this Order Key when directly selected from within Data Fabric returns one row. What Scripted Source is creating the measure table? Is the output from the Scripted Source producing more than one row?
0 -
"Birst Time Hierarchy is not exposed in Publish Data"
… Please pardon me but I don't understand what you mean here exactly…
0 -
The output of the scripted source is shown here:
This is the script that generates that fact table, so you can see that only one record with that key exists.
Unfortunately, not sure we can check in data fabric as I don't believe gov cloud has access.
If you did want to have a quick chat offline to review I would appreciate any insight! Thanks
0 -
@Dan_Rosen Maybe I say some nonsense, but did you try with a different separator than the colon (':')?
Another thing I'd check are the publishing logs.
And, in case you sort it out with Diane in the background, would you be so kind to post here the outcome? Thanks!
0 -
I haven't as I've been using that as my standard separator for a while. This is my first time setting up an incremental load space with this type of key so it's a valid thought, though I believe I've seen other people use this as well without issues. Yes, if anything gets sorted offline, will definitely report back
0