If all of the tiles are coming from a single file (via PMTiles), how are they cached by the client? Do clients cache byte ranges, or do you configure things to give the appearance of serving separate files?
The renderer used in the linked article (MapLibre) has an internal tile cache used when you return to a previously loaded area of the map.
For the non-tile parts of the PMTiles file, like the metadata and the directories that map Z/X/Y to byte offsets, those are LRU cached by the PMTiles implementation. The most widely deployed implementation right now is the TypeScript/Node one. The "client" can be either a web browser or a runtime like Lambda/Cloudflare Workers. On those, the cache persists across invocations in the ephemeral memory of the serverless function.
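To make the LRU idea concrete, here is a minimal sketch of a cache keyed by byte range. It is not the PMTiles library's actual cache; `LruCache`, `rangeKey`, and the `offset:length` key format are hypothetical names made up for illustration.

```typescript
// Hypothetical sketch: a small LRU cache keyed by byte range ("offset:length").
// A Map iterates in insertion order, so the first key is the least recently used.
class LruCache<V> {
  private entries = new Map<string, V>();

  constructor(private maxEntries: number) {}

  get(key: string): V | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // Evict the least recently used entry (first key in insertion order).
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}

// Assumed key format: one cache entry per (offset, length) range of the archive.
const rangeKey = (offset: number, length: number): string => `${offset}:${length}`;
```

In a serverless function, an instance like this would live at module scope, which is why it can survive between invocations for as long as the runtime keeps that memory around.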
A level deeper, browsers vary in how they cache Range requests; IIRC Firefox does not cache them, while others treat Range as if it were any other HTTP header.
Byte-range caching is typically handled in libraries, either in an edge worker (a la Cloudflare) that turns the ranges into separate GET requests from the client's perspective, or directly in the client.
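As a rough sketch of the edge-worker variant: the worker accepts an ordinary per-tile URL and turns it into a Range request against the single archive. The `/{z}/{x}/{y}.ext` URL scheme and the helper names `parseTilePath` and `rangeHeader` are assumptions for illustration; the Z/X/Y to offset lookup itself is left to the PMTiles library and is not shown.

```typescript
// Hypothetical helpers for an edge worker fronting a single PMTiles archive.
interface TileCoord {
  z: number;
  x: number;
  y: number;
}

// Parse "/{z}/{x}/{y}" with an optional extension; this URL scheme is an assumption.
function parseTilePath(pathname: string): TileCoord | null {
  const m = /^\/(\d+)\/(\d+)\/(\d+)(?:\.\w+)?$/.exec(pathname);
  if (!m) return null;
  const z = Number(m[1]);
  const x = Number(m[2]);
  const y = Number(m[3]);
  // At zoom z, valid x and y values run from 0 to 2^z - 1.
  const max = Math.pow(2, z);
  if (x >= max || y >= max) return null;
  return { z, x, y };
}

// Build an HTTP Range header value; the end position is inclusive.
function rangeHeader(offset: number, length: number): string {
  return `bytes=${offset}-${offset + length - 1}`;
}
```

Because the client only ever sees plain GETs for distinct tile URLs, ordinary HTTP caches can key on the URL and never have to deal with Range at all.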