We are using the ODK 2 tools in the field, and collecting lots of great data! One of the most common challenges in the field has been around conflicting data. We have data moving across teams and within teams with multiple individuals, as well as with centralized admin. Unsurprisingly, conflicts arise when syncing. It would be great for our data management purposes if there were a variable that tracked when a field worker chose to take their local version when there was a conflict, as it would be much easier to sort out conflicting data over time. Thanks!
Thanks for the suggestion, Caroline! While we consider this feature, I have a few suggestions that may help you.
We have a diff URI that shows the user responsible for the most recent changes. You can get to the diff URI by entering the URI for your server followed by “odktables”, your app name (this is typically “default”), and finally “tables”. An example URI would be the following:
You may be asked to authenticate with a valid user id at this point. Then you should be presented with a list of tables and their corresponding diff URIs (as well as other URIs). An example of a diff URI would be:
More information about the URIs can be found at https://docs.opendatakit.org/odk2/odk-2-sync-protocol/. With the diff URI, you can find the relevant row and get the “lastUpdateUser”.
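As a rough sketch of that last step, here is how you might pull the “lastUpdateUser” for one row out of a saved diff response in Python. The sample payload is made up for illustration, and the "rows" and "id" key names are assumptions about the response shape rather than something confirmed here, so please check them against the protocol documentation and your own server's output.

```python
import json

# Hypothetical sample of a diff response body. Only "lastUpdateUser" is
# taken from the discussion above; the "rows" and "id" keys and all values
# are assumptions for illustration.
SAMPLE_DIFF_RESPONSE = """
{"rows": [
  {"id": "uuid:row-1", "lastUpdateUser": "username:alice"},
  {"id": "uuid:row-2", "lastUpdateUser": "username:bob"}
]}
"""


def last_update_user(diff_json, row_id):
    """Return the lastUpdateUser of the row with the given id, or None."""
    payload = json.loads(diff_json)
    for row in payload.get("rows", []):
        if row.get("id") == row_id:
            return row.get("lastUpdateUser")
    return None  # no row with that id in this response


print(last_update_user(SAMPLE_DIFF_RESPONSE, "uuid:row-2"))
```

In practice you would replace the sample string with the body you get back from your own diff URI after authenticating.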
Another way to discover the user responsible for a change to a row would be to use the App Engine console and look at the log table. To access the console, go to https://console.cloud.google.com/ and log in with the username you used to create your App Engine instance. Be sure to select the relevant project for your App Engine instance. On the left pane, choose “Datastore” and then “Entities”. From the dropdown, select the log table. The format for the log table name will be:
For example, the log table name for the geotagger table would be:
Here you will be able to see all changes to all rows along with who was responsible for the change.
I hope this helps! Let me know if you have questions.
Thanks for the suggestions, Clarice! We had already figured out how to fish out who most recently updated the data (_last_update_user) by downloading with Suitcase and checking the “Extra metadata columns” box (this is an awesome feature!). The issue is that we are at a fairly sizeable scale (tens of thousands), so it would be super helpful to know specifically which data had involved both a conflict and an update, because that data will need much closer review, or may explain why we get inconsistent results across sub-forms, etc. So the issue is not so much who made the change as the fact that the change involved a sync conflict and overwrote data. Is there any way to tell that from any of the methods you mentioned? Thanks!
I’m glad to hear that Suitcase came in handy!
Thanks for explaining your use case in more detail. Unfortunately, there is not a nice way to do this. If you went to the App Engine console, you would be able to see all the changes that a row went through, but you would have to manually go through all the different versions to see what changed. I definitely see your need for this feature and will communicate this to the team.
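If you do end up going through the versions by hand, a small script may take some of the pain out of it. The sketch below assumes you have already copied the successive versions of one row out of the log table into a list of Python dictionaries, ordered oldest to newest; the column names and values are hypothetical. It only reports which fields changed between consecutive versions. It cannot tell you whether a change came from a sync conflict.

```python
def changed_fields(versions):
    """List (version_index, field, old_value, new_value) for every field
    that differs between consecutive versions of a row."""
    changes = []
    for index in range(1, len(versions)):
        before, after = versions[index - 1], versions[index]
        # Look at every field present in either version, in a stable order.
        for field in sorted(set(before) | set(after)):
            if before.get(field) != after.get(field):
                changes.append((index, field, before.get(field), after.get(field)))
    return changes


# Hypothetical versions of a single row, oldest first.
row_versions = [
    {"name": "A", "count": 1},
    {"name": "A", "count": 2},
    {"name": "B", "count": 2},
]

for change in changed_fields(row_versions):
    print(change)
```

Rows with several changes in a short time, or changes from more than one user, might be reasonable candidates for the closer review you mentioned.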