Cargo

Cargo
Version: 3.3
Author(s): Yaron Koren
Requirements: MediaWiki 1.23 or greater
Description: An extension that allows for the storage and querying of data contained within templates.
Default? No
For a complete Cargo user guide, see Extension:Cargo on mediawiki.org. This page does not attempt to duplicate information available there, and you should take advantage of both resources.
For an example Cargo setup, please see the SORCERER wiki.

Cargo is an extension that lets pages on the wiki talk to each other. Imagine having a spreadsheet with data about your game that has a bunch of sheets. One sheet talks about items. Another one talks about equipment. Yet another one tells you the stats of enemy bosses. Maybe there's one talking about different townspeople and their dialogue lines. You can imagine some columns that these might have: items have purchase costs, sell prices, ingredients, and bonus armor. Enemy bosses have HP, defense, weaknesses, strengths, and special skill thresholds. Townspeople have a greeting, a goodbye, an occupation, and a house. Cargo lets you actually create "virtual spreadsheets" in the form of "Cargo tables" that contain this information, and then makes them available so that any page on the wiki can use the information.

When you make wiki pages with infoboxes, you're already thinking about your data this way. You can see the similarities in the following questions:

Infoboxes | Spreadsheet Analogy | Cargo
What types of infoboxes do I need? | What Excel workbooks should I make? | What tables should I declare?
What fields go into this infobox? | What columns should I put into this Excel sheet? | What fields should I include in my declaration?
What values should I put into this infobox / how should I format them? | What type of formatting do I apply to the cells in this column? (Date, decimal, percentage, etc.) | What types should I assign to these fields? (DateTime, Boolean, String, etc.)

Why Cargo?

The reason to use Cargo is so you can reuse the same data on multiple pages across your wiki. Specifically, you can:

  1. Aggregate data. For example, each "thing" has its own page with an infobox, and you want to make a centralized display of all "things."
    • If you have items with stats, maybe you want to show the stats in the infobox on the item page, but also show them in a sortable table on the "Items" overview page.
    • Or every city has a population, an area, and a flag for whether it's a capital, and you want to make a list of all cities sorted by population.
  2. Re-aggregate data. For example, each "thing" has 2-5 components, each component also has its own page, and you want to automatically show which "things" correspond to each "component" on every "component" page. Here, "things" could be "in-game items" with "recipes," or it could be "cities" with "types of crafting locations." The idea is that there's some parent-child relationship and you want the two sides of the same relationship showing up on both the parent and child pages.
    • Maybe you want to show a list of all dialogue lines of every person in one town on one page, and show the list of dialogue options of every blacksmith on another page.
    • Or maybe your game has a crafting system, and you know that a dagger recipe is 2 iron ores. On another page, you want to show every item that requires at least one iron ore.

You could also use Cargo to de-aggregate data, for example if you want to use a bot to upload every piece of item data to a single page in a Data namespace (say there are only 50 items, so this is reasonable to do). Then you could query that one bot-generated page from each infobox. You should only do this if you are already using Cargo for another purpose. Otherwise, there are more efficient and easier ways to achieve this structure, such as loading JSON data in Lua or having your bot edit 50 pages instead of 1, which is very achievable using the Python libraries mwclient and mwparserfromhell.

Vocabulary

Term | Spreadsheet Equivalent | Definition
Table | Excel sheet | A collection of similarly typed data objects along with their properties
Declare | Create new file | Tell the wiki about a new table you want it to make
Write to | Save | Save data to a table
Read from | Open | Get data from a table
Store | Add a new row and save | Add a row of data to a table
"Where" condition | Filtering a column | A condition that you require all returned rows to satisfy in order to be included in the result

How do I start?

Since Cargo is not a default extension, the first step is to get Cargo on your wiki, which you can do by talking to staff. If you have development experience, it's highly recommended to use the Scribunto extension and write Lua modules as well, but if you don't want anything to do with coding, that's fine! You can still use Cargo without writing a single line of code.

Once Cargo is enabled, there are three steps to creating a table.

Step 1: Declaration

In the Excel analogy, this is like opening a new workbook, creating column labels, and applying cell formatting to each column. That's it. There's no data stored yet, nothing is being retrieved, you are just telling the wiki what fields to expect when you do eventually store data.

A Cargo declaration for a set of fictional cities that have no relationship to cities from any popular video game might look something like this:

{{#cargo_declare:_table=Cities
|name=String
|population=Integer
|area=Float
|motto=Text
|isCapital=Boolean
}}

First we are giving the table a name: "Cities." Then we're saying, "Things in this table have 5 pieces of information we need to know." These are as follows:

  1. The name of the city. This is going to be the primary identifier of each row, so we'll make sure it's a string type.
    • A string is the default type. String means "a set of characters." The difference between "String" and "Text" isn't too important most of the time, but we'll elaborate a bit below.
    • If you want to store wiki markup and have it rendered by default, you could instead use the data type Wikitext string.
  2. The city's population. It's a number, and you can't have a fraction of a person, so this is an Integer.
    • An integer is a number that cannot have any fractional/decimal component.
  3. The city's area in some video game metric. You CAN have fractional units here, so we'll use a Float.
    • A float is a number that CAN have fractional/decimal component.
  4. The city's motto. We don't anticipate that we'll ever fetch data based on the value here, so to improve performance we'll store this as Text. But, it would also be totally valid to store this as a String. When in doubt, use a String.
    • Again, read about indexing to learn the difference between String and Text. But generally, a text field is a large blob of data, and a string is a short piece of data that identifies a characteristic of the entity in question.
  5. Whether or not this city is its country's capital. This can be either True or False. The data type that represents true/false values is called "Boolean."
    • A boolean is a variable that has two options: "True" and "False." False means "Not True."
      • In Cargo, you should store a boolean as either Yes or No.
      • And you should query it as ="1" (true/yes) or ="0" (false/no). We'll see an example later on.

After you declare a table, you'll need to perform one more step before it's ready to be used - if you're an admin on your wiki, you can click a button that says "Create Data Table" on the template page after you've declared it and saved. If not, you can ask your Wiki Manager or another admin to help you. Every time you add or remove a field, or change the type of one of the fields, you'll have to recreate the table as well. This is done basically the same way - except now, instead of "Create" it'll say "Recreate." Again, you'll need to be an admin or ask for help from someone who's an admin on your wiki.

For additional information about recreating tables, see Recreating Tables.

Indexing

This section explains why I said you should avoid using Text & Wikitext in "where" or "join" conditions. If you want to just trust me on that, you can skip it.

Looking up data from Cargo tables is a slow process compared to some other things that you might do. In order to make it faster to find the rows that you're looking for when you provide a "where" condition (i.e. a filter), some fields are "indexed." But, indexing a field makes it slower to store in the first place. So to speed up the storing process, Cargo gives you the option of using "text" and "wikitext" for very long fields which won't be indexed. It's fine to query these to DISPLAY, but you should never FILTER based on their values, because without an index that will be super slow. If you're confused about this, you can simply avoid the "Text" and "Wikitext" types. If you think you need them, ask your Wiki Manager for help!

Here's an example of a lookup that will work well, and one that won't. Let's say we have a table of townspeople, and the field Greeting is Text type, because some greetings can be very long. The other fields, including City, are all type String. We make a query and the "where" acts as a filter.

Okay:
|where= City="Whiterun"
|fields= Name, Occupation, Greeting
Filtering with "where" based on City, which is a string field, will work fine.
Not okay:
|where= Greeting="What do you get when you cross an arrow with a knee?"
|fields=Name, Occupation, City
Filtering with "where" based on Greeting, which is a text field, will make the query slower. This type should be avoided.
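
Spelled out as a complete query, the "Okay" case might look like the sketch below. The table name Townspeople and the declaration itself are assumptions made up for illustration; only the field names and types come from the description above. The declaration and the query would live on different pages.

```wikitext
<!-- Hypothetical declaration: Greeting is Text (not indexed), the other fields are String (indexed) -->
{{#cargo_declare:_table=Townspeople
|Name=String
|Occupation=String
|City=String
|Greeting=Text
}}

<!-- Filter on the String field City; Greeting is only fetched for display -->
{{#cargo_query:table=Townspeople
|fields=Name, Occupation, Greeting
|where=City="Whiterun"
}}
```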

Where do I declare a table?

Two options:

  1. In the <noinclude></noinclude> section of the template that you're storing the data from.
  2. In a standalone template that isn't used anywhere else.

It's up to you; the only requirement is that the declaration has to be in the template namespace. My preference is for option 2, but then you have to remember to add {{#cargo_attach:_table=TABLE_NAME}} to the template that stores the data. This tells the wiki to look for rows on the pages where your attaching template is used.
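
As a rough sketch of option 2, assuming a standalone declaration template and a separate infobox template that does the storing (both template names below are made up for illustration):

```wikitext
<!-- Template:Cities/CargoDec (hypothetical name): holds only the declaration -->
<noinclude>{{#cargo_declare:_table=Cities
|name=String
|population=Integer
|area=Float
|motto=Text
|isCapital=Boolean
}}</noinclude>

<!-- Template:CityInfobox (hypothetical name): the template that stores the data attaches the table -->
<noinclude>{{#cargo_attach:_table=Cities}}</noinclude>
```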

You can check out our Cities sample data templates on this wiki.

Step 2: Storing Data

Okay, great, so now we've basically opened a new workbook and added some column labels and types. What next? Well, it's pretty useless without some rows of data added to it. So let's add some data! Typically, you want to store data from an infobox template. (We are not doing so in this example because worldbuilding is hard.)

Here's the example store template from our Cities example:

{{#cargo_store:_table=Cities
|name={{{name|}}}
|population={{{population|}}}
|area={{{area|}}}
|motto={{{motto|}}}
|isCapital={{{isCapital|}}}
}}

You can see this template in action at Cargo/test data.
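
To add a row, a page calls the storing template with its values. A minimal sketch, assuming the store template above is saved as Template:CityData (a hypothetical name); the values are taken from the Dawnstar row in the query results further down this page:

```wikitext
{{CityData
|name=Dawnstar
|population=6800
|area=11.3
|motto=Good morning!
|isCapital=Yes
}}
```

Note that isCapital is passed as Yes, matching the advice on storing booleans in the declaration step.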

Step 3: Querying Data

So now we have a beautiful spreadsheet of data. It has a name, and it has a bunch of columns, with a bunch of rows that contain data. That's great and all, but...we haven't actually done anything yet. So let's do something!

"Doing something" means using the function {{#cargo_query:}}. Let's try making a table of cities with a population of more than 10,000. Note that unlike declare and store, this parser function takes an argument of table and not _table ({{#cargo_attach:}} can accept either one). You can also use tables instead of table.

Note, in the following examples, we're writing CONCAT(motto)=motto instead of motto to work around a common display bug with apostrophes.

{{#cargo_query:table=Cities
|fields=name, population, area, CONCAT(motto)=motto, isCapital
|where=population > "10000"
}}
name | population | area | motto | isCapital
Riften | 17,119 | 40.6 | Welcome to Summoner's - wait wrong Rift | No
Windhelm | 85,102 | 10.5 | The actually windy city | No
Winterfell | 10,285 | 18.3 | No, we're not Winterhold. What do you think this is, Skyrim? | No

What if we wanted to show only small cities?

{{#cargo_query:table=Cities
|fields=name, population, area, CONCAT(motto)=motto, isCapital
|where=area < "20"
}}
name | population | area | motto | isCapital
Dawnstar | 6,800 | 11.3 | Good morning! | Yes
Windhelm | 85,102 | 10.5 | The actually windy city | No
Winterfell | 10,285 | 18.3 | No, we're not Winterhold. What do you think this is, Skyrim? | No

Finally, let's fetch the capital city:

{{#cargo_query:table=Cities
|fields=name, population, area, CONCAT(motto)=motto, isCapital
|where=isCapital="1"
}}
name | population | area | motto | isCapital
Dawnstar | 6,800 | 11.3 | Good morning! | Yes

Turns out, it's pretty easy to do! Imagine for your wiki, if you were showing a list of items and all of their stats, or a list of townspeople and all their dialogue lines, or, or, or...It's pretty endless what you can do.
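
The same kind of query can also sort and cap its results. The sketch below uses Cargo's |order by= and |limit= parameters to list the most populous cities first; the cutoff of five rows is an arbitrary choice for illustration.

```wikitext
{{#cargo_query:table=Cities
|fields=name, population
|order by=population DESC
|limit=5
}}
```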

Advanced queries

See customizing tables for more information about adding markup to |format=table queries if you just want tables with markup.

Maybe you noticed that we didn't have much control over what the output looked like. All we could see was a table with all of the values. What if we want to just show a list of item icons to make a list of everything that one item builds into? What if we want to make a grid with every hero's thumbnail and a link to that hero's page, and display it on the wiki front page? Turns out, you can do all this and more - by setting |format=template (the default is |format=table when you don't set any |format= in your query).

Due to a bug in Cargo, |format=template sometimes displays incorrectly. As a workaround, use {{CargoQuery}} in place of the #cargo_query parser function, with exactly the same syntax as a |format=template query. This workaround does not require you to use any Lua.

When you set |format=template, you can make a template that controls the format of every "row" in the output. So instead of being a table, you can display a list, or just a bunch of icons - whatever you want! Here are a few things you could choose to do:

  • Have the template output one row of a wikitable, but with more advanced styling in each row. This would require you to define both the table header and table end, using either |intro= and |outro= or just including extra elements surrounding your query.
  • Have the template display a thumbnail of one of the values along with its name, to make a navigation grid
  • Use #vardefine and #expr to calculate totals and then print them at the end (though also see Working with numbers)
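
As a minimal sketch of a template-formatted query, assume a row template saved as Template:CityRow (a hypothetical name). With |named args=yes, Cargo passes each field to the template under its field name, so the row template can refer to {{{name}}} and {{{population}}} directly:

```wikitext
<!-- The query: Template:CityRow is called once per matching row -->
{{#cargo_query:table=Cities
|fields=name, population
|where=isCapital="0"
|format=template
|template=CityRow
|named args=yes
}}

<!-- Template:CityRow (hypothetical): formats a single row as a list item -->
* [[{{{name|}}}]] (population: {{{population|}}})
```

Since {{CargoQuery}} is described as taking the same syntax, the same parameters should carry over to it, though that is not verified here.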

See the MediaWiki Cargo documentation for more information.

Super-advanced stuff

  • Cargo tables can be joined to each other. The syntax is |join on=, or just join = in Lua.
  • In Lua, there is a dedicated function mw.ext.cargo.query, but if you want to declare or store from Lua, you need to use frame:callParserFunction. See also the Cargo Lua example.
  • Cargo tables are exposed via the MediaWiki API through action=cargoquery. This can be extremely useful.
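
A join in wikitext might look like the following sketch. It assumes a second table, Countries, with name and ruler fields, and a country field on Cities. None of these exist in the Cities example on this page; they are invented purely for illustration.

```wikitext
{{#cargo_query:tables=Cities, Countries
|join on=Cities.country=Countries.name
|fields=Cities.name=City, Countries.name=Country, Countries.ruler=Ruler
|where=Cities.population > "10000"
}}
```

Because both tables have a name field, each field is prefixed with its table name and given an alias after the = sign.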

Working with numbers

Often you want Cargo to return numbers of things. Here is some advice.

SQL commands

  • You can use COUNT(*) to get the total number of entries that match your WHERE
  • You can use COUNT(DISTINCT FieldName) to get the total number of distinct entries in a field, for the rows that match your WHERE
  • SUM, MAX, MIN, etc exist. See Using SQL functions in the main Cargo docs for more information.
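
For instance, a sketch against the Cities table from earlier that counts the non-capital cities, totals their population, and finds the largest area. The aliases after each = sign are arbitrary names chosen here:

```wikitext
{{#cargo_query:table=Cities
|fields=COUNT(*)=Total, SUM(population)=TotalPopulation, MAX(area)=LargestArea
|where=isCapital="0"
}}
```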

Using numbers from Cargo

  • One issue that people frequently encounter is that #cargo_query returns integers/floats with thousands separators. Wrap TRIM() around a field (e.g. |fields=TRIM(GoldCost)=GoldCost) to remove thousand separators if using #expr or other parser functions.
  • Another issue is that Cargo adds some extra HTML to outputs. If you want to do something like {{#expr:{{#cargo_query:QUERY HERE}} * 100}} to display a percentage, you need to add the parameter |no html to your query. This will suppress the output of wrapping HTML and return just the number, so that #expr can understand it.
  • If you are using {{CargoQuery}} for your query, neither of the above problems should occur.
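
Putting the first two tips together, a sketch that feeds a single value into #expr might look like this. It assumes the query returns exactly one row and one field (hence the |limit=1); anything else would not be a valid #expr input. The division by 1000 is just an example calculation.

```wikitext
{{#expr: {{#cargo_query:table=Cities
|fields=TRIM(population)=population
|where=name="Dawnstar"
|limit=1
|no html
}} / 1000 }}
```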

Miscellaneous common problems

  • As mentioned before, |format=template is buggy; use {{CargoQuery}} instead.
  • When using HOLDS, the errors reported in Lua often have nothing to do with the actual issue. Ideally, just don't use HOLDS, period. If you need to, ignore the error text if you get any.
  • Similar to the issue with thousands separators and numbers, you also may need to wrap fields that are stored via the {{PAGENAME}} magic word in a TRIM(). (_pageName is a default field but maybe you used {{#titleparts:}} for a field.)
  • Each template can only attach 1 table. So if you need a template to write to multiple tables, you might have to create "fake" templates that do nothing other than attach the table, but then are transcluded by the template doing the storing onto every page along with the store.
    • Similarly, a template can only declare one table.
    • A template CAN, however, both declare and attach - for a maximum of 2 tables available without any workarounds.
    • See Attaching tables for more information.
  • unique is broken; don't use it.
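
The attach workaround for writing to multiple tables, described above, could be sketched as follows. All names here (the CityMottos table and both templates) are hypothetical, and this is one reading of the workaround rather than a tested recipe:

```wikitext
<!-- Template:CityInfobox (hypothetical): attaches Cities itself and stores to both tables -->
<noinclude>{{#cargo_attach:_table=Cities}}</noinclude><includeonly>{{#cargo_store:_table=Cities
|name={{{name|}}}
}}{{#cargo_store:_table=CityMottos
|motto={{{motto|}}}
}}{{CityInfobox/AttachMottos}}</includeonly>

<!-- Template:CityInfobox/AttachMottos (hypothetical): outputs nothing, only attaches the second table -->
<noinclude>{{#cargo_attach:_table=CityMottos}}</noinclude>
```

The "fake" template ends up transcluded on every page that uses the infobox, which is what lets the wiki find those pages when the second table is recreated.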

Help! My table isn't repopulating!

If you recreate your table and it refuses to repopulate, here are a couple of things you can check:

  1. Correct attach or declare: Are you sure you are attaching or declaring your table?
  2. Wrong data types:
    • Are all your integer fields actually integers (no wiki markup, no letters, no + sign, no decimal points)?
    • Are all your float fields actually numbers (no wiki markup, no letters, no + sign)?
    • Are any of your strings too long? By default, a string can be at most 300 characters. You can change this by writing |FieldName=String (size=500), or whatever your desired size is. However, you might prefer to change a long string field to Text.
  3. Does your table have too many columns? The precise number of columns you can have depends on some wiki settings, but if you're at about 60 or so columns, you might worry about this.
  4. Are you waiting long enough? If you've checked EVERYTHING above and are 100% sure it's not one of these issues, try waiting up to ~20-30 minutes for it to start populating.
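
For the string-length check, a declaration with a longer limit might look like this sketch. It reuses the Cities table from earlier; picking motto as the long field and 500 as the size is only for illustration:

```wikitext
{{#cargo_declare:_table=Cities
|name=String
|population=Integer
|area=Float
|motto=String (size=500)
|isCapital=Boolean
}}
```

As with any change to a field's type, the table has to be recreated afterwards.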

See also