Using Suitcase to add and update observations?

elmps2018 · April 7, 2018, 4:57pm

Hi All!

We are using the ODK 2.0 suite, and we are going to have to update our survey database regularly as we are in the field. For example, when members of a household split over time, we need to create new households for them, assign them to the right team, etc. We have the process for this all set up in STATA to take a finalized household and identify any splits, assign them an id, and generate a new household. Or we have to systematically reassign households that have moved to a different team (we are doing this at a fairly large scale, so cannot just edit one by one in tables)–we can do this in STATA as well, systematically, based on addresses.

STATA is successfully generating csvs with all the right data to then add to our database (Aggregate 1.4.15) for new households and modified households. What we have been doing is re-pushing to a tablet, importing the revised csv and then syncing. This is a rather sub-optimal process for updating.

Having been playing around with downloading in suitcase, it looks like there is also a possibility to upload in suitcase, but alas, no documentation. I worked through a first error message which implied that the uploaded structure needed to include sub-folders of assets and tables, and got to the point of “Uploading…” but then the message “An error occurred.”

I know there was a pending documentation item on uploading in suitcase:

github.com/odk-x/tool-suite-X

Provide documentation for Suitcase Upload

opened 08:28PM - 03 Oct 17 UTC

closed 06:21AM - 30 Mar 21 UTC

jbeorse

good first issue Documentation Suitcase Medium-Priority easy win

#### Software and hardware versions ODK Suitcase 2.0.0 #### Problem descrip…tion Suitcase documentation doesn't cover uploading and resetting the app server: http://opendatakit-dev.cs.washington.edu/2_0_tools/release/214/suitcase Furthermore the CLI help descriptions are quite brief and sometimes confusing. These should be updated and the documentation linked above should be updated to be more thorough and clear. Also, I think the dataVersion CLI option should change to something like API version, or otherwise make it clear that this has nothing to do with the user's data.

Any updates on that or anyone who can point me in the right direction in the meantime? Help much appreciated so we can test the uploading before we go to the field in a week or so!

Thanks much!

Best,
Caroline

Jeff_Beorse · April 8, 2018, 2:26am

I believe these upload features were mostly implemented by @linl33 and @clarice_larson . I know Clarice is on vacation for the next week, but maybe Li can provide some rough tips in the meantime.

linl33 · April 9, 2018, 6:04am

Hi Caroline,

The option for uploading data using Suitcase is update. This feature is only available from the command line.

To use the update function of Suitcase, invoke the Suitcase jar with the -update switch.
For example,
If I have the following server configuration


Server Address	https://odk.example.com
App Id	default
Username	admin
Password	mySecurePassword123

And I want to update the table with table id household using data stored in a CSV located at /home/user/household_april.csv.

I would use the following command:

java -jar 'ODK Suitcase v2.0.3 rev 220.jar' \
  -cloudEndpointUrl 'https://odk.example.com' \
  -appId 'default' \
  -dataVersion 2 \
  -username 'admin' \
  -password 'mySecurePassword123' \
  -update \
  -tableId 'household' \
  -path '/home/user/household_april.csv'

The CSV file used for this operation is mostly the same as the one used for data pre-loading on Android, with one small difference.

The first column of the CSV file has to be an operation column.
The valid values for this column are: FORCE_UPDATE, UPDATE, NEW and DELETE
The value in this column instructs Suitcase how to treat that row.

UPDATE and FORCE_UPDATE
Both of these options are used for updating data on the server. They update data on the server by matching the _id. FORCE_UPDATE retries updates that failed with a more aggressive strategy.

NEW
Used for creating new rows.

DELETE
Used for deleting rows. Rows are matched using _id.

elmps2018 · April 9, 2018, 2:28pm

Dear Li,

Thank you for this careful description. This sounds like exactly what we need! We will test it out and let you know how it goes.

I don’t think this is documented in ODK-X documentation — ODK-X Docs yet, so once we have worked through all the steps we can document that.

Best,
Caroline

elmps2018 · April 9, 2018, 2:33pm

And already thought of a question! So we are not using cloudendpoint, just aggregate on a SQL server (for fielding, we’ll test it first on an appengine setup). Will this still work just as written, or do we need to modify the command in any way?

Thanks much!

Best,
Caroline

linl33 · April 9, 2018, 6:05pm

Aggregate will work just fine. Cloud Endpoint describes Sync Endpoint, Aggregate (with Tables extension) and any future compatible implementation.

elmps2018 · April 13, 2018, 3:25am

Happy to report that we have successfully updated some data! Still working through all the details of the different operation options and deciphering the (many pages when it works…) output in terminal. But the concept seems to be working well so far! Extra appreciation to whomever programmed in all the informative error messages for the command that we were able to work through, like “CSV does not contain metadata column:_row_etag” because then I could figure out to solve it by adding that column!

Thanks!

Larry_Li · September 9, 2020, 4:41pm

This solution is not working for “-delete” operation with ODK-X Suitcase v2.1.6. This was the command I tried to run:

java -jar ODK-X_Suitcase_v2.1.6.jar -cloudEndpointUrl precis
e.obgyn.ubc.ca -appId default -dataVersion 2 -username demo -password pwd123 -delete -tableId profileSummary -path C:\ODK_X_Suitcase\profileSummary_link_extra.csv

Here is the error message:

org.apache.commons.cli.UnrecognizedOptionException: Unrecognized option: -delete
at org.apache.commons.cli.DefaultParser.handleUnknownToken(DefaultParser.java:360)
at org.apache.commons.cli.DefaultParser.handleConcatenatedOptions(DefaultParser.java:702)
at org.apache.commons.cli.DefaultParser.handleShortAndLongOption(DefaultParser.java:533)
at org.apache.commons.cli.DefaultParser.handleToken(DefaultParser.java:243)
at org.apache.commons.cli.DefaultParser.parse(DefaultParser.java:120)
at org.apache.commons.cli.DefaultParser.parse(DefaultParser.java:76)
at org.apache.commons.cli.DefaultParser.parse(DefaultParser.java:60)
at org.opendatakit.suitcase.ui.SuitcaseCLI.parseArgs(SuitcaseCLI.java:247)
at org.opendatakit.suitcase.ui.SuitcaseCLI.startCLI(SuitcaseCLI.java:90)
at org.opendatakit.suitcase.Suitcase.main(Suitcase.java:17)

Can someone help post the correct parameters and syntax for Sync Endpoint? Many thanks!

elmps2018 · September 9, 2020, 5:18pm

Hi @Larry_Li! As you are new to the community, when you get a chance, please introduce yourself here. I’d also encourage you to add a real picture as your avatar because it helps build community!

Please note that delete is not a command – it needs to be the value within the operation column of the csv you are uploading, see:

W_Brunette · September 9, 2020, 6:02pm

@Larry_Li welcome to the ODK-X Community!

@elmps2018 is correct if you are trying to delete specific rows. If that is your intent the CSV should have the delete in the operation column with the rest of the row.

IF instead you are trying to delete the entire table including it’s associated files and definitions. Then you can use this command based on me guessing from the line you showed.
java -jar ODK-X_Suitcase_v2.1.6.jar -cloudEndpointUrl https://precise.obgyn.ubc.ca -appId default -dataVersion 2 -username demo -password pwd123 -tableOp DELETE -tableId profileSummary

It would be helpful if you tell us precisely what you are trying to do so we can give improved answers.

Larry_Li · September 10, 2020, 12:50am

@elmps2018 @W_Brunette Many thanks for your immediate help!! You are right and I should run with “-update” command.

This is the file profileSummary_link_extra.csv (1.8 KB) I am using. I changed the value of operation column to “DELETE”, and still kept “_deleted” unchanged (false). I found when I saved as “CSV (UTF-8)” format in Microsoft Excel, it was saved in the “UTF-8 BOM” format. I converted to the “UTF-8” format and it was working well.

By the way, which columns are necessary or unnecessary to keep if I want to delete? What about updating records? Can I update “_id” column to link with a different record in another table?

elmps2018 · September 10, 2020, 2:25pm

Glad that -update worked!

You definitely need the _id and the operation columns. I am not sure if any others are needed, but if you want to change which instances you are altering, you can change the _id to whatever instance you want to delete or update. If you want to do something in a different table, you can, but will need to use the _id within that table and also the right tableID when you are running the command.

What we generally did for updates using Suitcase was to download the data in a csv and then add the operation column to that csv with DELETE or whatever and then upload that file. The processing can be done just in the csv if you are doing only a little bit, or if going to scale you can use STATA or SPSS or R (for instance, we used STATA to identify and prepare a csv to delete completed records) – just be super careful about all the UTF formatting as you note it can get messed up in Excel etc.