upon learning about postgresql arrays and mapserver
I learnt something new this week (week 2 in my new job)- it’s probably not new to everyone else, but just in case someone is interested I thought I would document it…
Scenario: You have some Ordnance Survey Mastermap data in a PostgreSQL database imported using OGR2OGR. You wish to display it using Mapserver. The key fields you are interested in for styling your data are “descriptivegroup”, “descriptiveterm” and “make”. These dictate the actual detail about the styling, such as the type of building or type of land.
Problem: These fields may well appear in PostgreSQL as arrays rather than simple text, in other words they may hold more than one value per record. Querying arrays in PostgreSQL requires a slightly different syntax to querying normal text fields (this was the first thing I didn’t know beforehand). This means that simple select queries don’t work in MapServer when you try and filter data to style it in different ways.
My Solution (caution, not necessarily the best or most elegant): Use the PostgreSQL function array_to_string in the query MapServer sends to PostgreSQL. This has the form array_to_string(array, ‘delimiter’) to create a text string from your array using your chosen character to split the terms up. However, it would appear that you can’t simply ask for array_to_string(descriptivegroup,'|') in your select query, you also need to ask for descriptivegroup. Then I used regular expressions to search for the term I need in my Layer Class styling, like so:
DATA “select wkb-geometry from (select wkb_geometry, descriptivegroup, array_to_string(descriptivegroup, ‘|') as dgrp, fid from topographicarea) as foo using unique fid using unique srid=27700”
…
LAYER
…
CLASSITEM “dgrp”
CLASS
EXPRESSION /^Inland Water$/
…
END
END
…
END
So basically I’m asking for any data from the topographicarea table where the column “descriptivegroup” has the term ‘Inland Water’ in it at any point. And I’m being good and using the UNIQUE terms for best performance of course!
Now, there are other ways of doing this. I could have converted the data to strings in PostgreSQL rather than leaving it as arrays. However, next time I updated the Mastermap data, I’d have to remember to do the conversion again, or the map wouldn’t display correctly. Alternatively, I could have created a view in PostgreSQL that did the conversion to string, and then simply query the view in MapServer. I’d be interested in knowing which option is better for performance actually- converting the data into a view in the database, or doing it on the fly. If there’s a simpler solution to the problem, I’d also like to know, though I might sulk a little bit if it turns out that I missed something obvious…