
Drop-in offline caching for UIWebView (and NSURLProtocol)

January 29th, 2012

The most up-to-date source for this is now available on GitHub.

Your programs need to deal gracefully with being offline. Mugunth Kumar has built an excellent toolkit called MKNetworkKit that manages REST connections while offline, and Chapter 17 of our book is devoted to the ins and outs of this subject.

But sometimes you just have a simple UIWebView, and you want to cache the last version of the page. You’d think that NSURLCache would handle this for you, but it’s much more complicated than that. NSURLCache doesn’t cache everything you’d think it would. Sometimes this is because of Apple’s decisions in order to save space. Just as often, however, it’s because the HTTP caching rules explicitly prevent caching a particular resource.

What I wanted was a simple mechanism for the following case:

  • You have a UIWebView that points to a website with embedded images
  • When you’re online, you want the normal caching algorithms (nothing fancy)
  • When you’re offline, you want to show the last version of the page

My test case was simple: a webview that loads cnn.com (a nice complicated webpage with lots of images). Run it once. Quit. Turn off the network. Run it again. CNN should display.

Existing solutions

The ever-brilliant Matt Gallagher has some interesting thoughts on how to subclass NSURLCache to handle this, but I find his solution fragile and unreliable, especially on iOS 5. The HTTP caching rules are complicated, and in many cases you need to connect to the server to re-validate your cache before you’re allowed to use your local copy. Unless everything works out perfectly, his solution may not work when you’re offline, or may force you to turn off cache validation (which could make your pages go stale).

AFCache is also promising, using essentially the same approach. I haven’t found the offline support to work very well, at least in my tests, for the same reasons as Matt’s solution. It’s designed to be an advanced HTTP-caching solution. The docs are limited and I couldn’t get it to pass my CNN test.

RNCachingURLProtocol

So, I present RNCachingURLProtocol. It isn’t a replacement for NSURLCache. It’s a simple shim for the HTTP protocol (that’s not nearly as scary as it sounds). Anytime a URL is downloaded, the response is cached to disk. Anytime a URL is requested, if we’re online then things proceed normally. If we’re offline, then we retrieve the cached version. The current implementation is extremely simple. In particular, it doesn’t worry about cleaning up the cache. The assumption is that you’re caching just a few simple things, like your “Latest News” page (which was the problem I was solving). It caches all HTTP traffic, so without some modifications, it’s not appropriate for an app that has a lot of HTTP connections (see MKNetworkKit for that). But if you need to cache some URLs and not others, that is easy to implement.

First, a quick rundown of how to use it:

  1. At some point early in the program (application:didFinishLaunchingWithOptions:), call the following:

    [NSURLProtocol registerClass:[RNCachingURLProtocol class]];

  2. There is no step 2.

Since RNCachingURLProtocol doesn’t mess with the existing caching solution, it is compatible with other caches, like AFCache. In fact, the technique used by RNCachingURLProtocol could probably be integrated into AFCache pretty easily.

The cache itself is stored in the Library/Caches directory. In iOS 5, this directory can be purged whenever space is tight. Keep that in mind. You may want to store your caches elsewhere if offline access is critical.
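If you do move the cache, here is a minimal sketch of one way to do it, assuming you are willing to keep the files under Application Support and exclude them from backup. The function name and the "OfflineCache" folder name are hypothetical, not part of RNCachingURLProtocol, and NSURLIsExcludedFromBackupKey requires iOS 5.1 or later.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of RNCachingURLProtocol): returns a directory
// under Application Support, creating it if needed, and marks it as excluded
// from backup. Returns nil if the directory cannot be created.
static NSString *SketchOfflineCacheDirectory(void)
{
    NSArray *paths = NSSearchPathForDirectoriesInDomains(NSApplicationSupportDirectory,
                                                         NSUserDomainMask, YES);
    // "OfflineCache" is an arbitrary folder name chosen for this sketch.
    NSString *directory = [[paths lastObject] stringByAppendingPathComponent:@"OfflineCache"];

    NSError *error = nil;
    if (![[NSFileManager defaultManager] createDirectoryAtPath:directory
                                   withIntermediateDirectories:YES
                                                    attributes:nil
                                                         error:&error]) {
        NSLog(@"Could not create cache directory %@: %@", directory, error);
        return nil;
    }

    // Keep the cached pages out of backups (iOS 5.1 and later).
    NSURL *directoryURL = [NSURL fileURLWithPath:directory isDirectory:YES];
    if (![directoryURL setResourceValue:[NSNumber numberWithBool:YES]
                                 forKey:NSURLIsExcludedFromBackupKey
                                  error:&error]) {
        NSLog(@"Could not exclude %@ from backup: %@", directory, error);
    }
    return directory;
}
```

You would then build the per-request file path from this directory instead of Library/Caches inside cachePathForRequest:.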

Understanding NSURLProtocol

An NSURLProtocol is a handler for NSURLConnection. Each time a request is made, NSURLConnection walks through all the protocols and asks “Can you handle this request (canInitWithRequest:)?” The first protocol to return YES is used to handle the connection. Protocols are queried in the reverse order of their registration, so your custom handlers will get a crack at requests before the system handlers do.

Once your handler is selected, the connection will call initWithRequest:cachedResponse:client: and then startLoading. It is then your responsibility to call back the client you were handed (an NSURLProtocolClient) with URLProtocol:didReceiveResponse:cacheStoragePolicy:, some number of calls to URLProtocol:didLoadData:, and finally URLProtocolDidFinishLoading:. If these sound similar to the NSURLConnection delegate methods, that’s no accident.
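As a rough illustration of that callback sequence, here is a sketch of replaying a previously saved response to the client. The function name is hypothetical, and how the response and data were saved to disk is left out; this is one way to drive the client, not necessarily how RNCachingURLProtocol does it.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: plays a saved response back to the protocol's client.
// `response` and `data` are assumed to have been loaded from disk by the caller.
static void SketchReplayCachedResponse(NSURLProtocol *protocol,
                                       NSURLResponse *response,
                                       NSData *data)
{
    id<NSURLProtocolClient> client = [protocol client];

    if (response == nil || data == nil) {
        // Nothing saved for this request: report a failure rather than hang.
        NSError *error = [NSError errorWithDomain:NSURLErrorDomain
                                             code:NSURLErrorCannotConnectToHost
                                         userInfo:nil];
        [client URLProtocol:protocol didFailWithError:error];
        return;
    }

    // 1. The response, 2. the body (here in a single chunk), 3. completion.
    [client URLProtocol:protocol
     didReceiveResponse:response
     cacheStoragePolicy:NSURLCacheStorageNotAllowed]; // we manage our own cache
    [client URLProtocol:protocol didLoadData:data];
    [client URLProtocolDidFinishLoading:protocol];
}
```

From inside startLoading you might call it as SketchReplayCachedResponse(self, savedResponse, savedData), where the two saved values are whatever you unarchived from disk.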

While online, RNCachingURLProtocol just forwards requests to a new NSURLConnection, making copies of the results, and passing them along to the original connection. When offline, RNCachingURLProtocol loads the previous result from disk, and plays it back to the requesting connection. The whole thing is less than 200 lines of pretty simple code (not counting Reachability, which I include from Apple’s sample code to determine if we’re online).

There’s a subtle problem with the above solution. When RNCachingURLProtocol creates a new NSURLConnection, that new connection has to find a handler. If RNCachingURLProtocol says it can handle it, then you’ll have an infinite loop. So how do I know not to handle the second request? By adding a custom header (X-RNCache) to the HTTP request. If it’s there, then we’ve already seen this one, and the handler returns NO.
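The header check can be sketched roughly as follows. The class name and the taggedRequest helper are hypothetical and only illustrate the idea; the forwarding and caching work in startLoading is deliberately left out, so this class is not meant to be registered as-is.

```objc
#import <Foundation/Foundation.h>

// Header name taken from the text above; everything else here is illustrative.
static NSString * const SketchCachingHeader = @"X-RNCache";

@interface SketchCachingURLProtocol : NSURLProtocol
// Copy of the original request carrying the marker header.
- (NSURLRequest *)taggedRequest;
@end

@implementation SketchCachingURLProtocol

+ (BOOL)canInitWithRequest:(NSURLRequest *)request
{
    // Handle plain HTTP only, and only if this request has not been tagged yet.
    BOOL isHTTP = [[[[request URL] scheme] lowercaseString] isEqualToString:@"http"];
    BOOL alreadyTagged = ([request valueForHTTPHeaderField:SketchCachingHeader] != nil);
    return isHTTP && !alreadyTagged;
}

+ (NSURLRequest *)canonicalRequestForRequest:(NSURLRequest *)request
{
    return request;
}

- (NSURLRequest *)taggedRequest
{
    // A startLoading implementation would hand this copy to its own
    // NSURLConnection, so the second pass through canInitWithRequest: says NO.
    NSMutableURLRequest *tagged = [[self request] mutableCopy];
    [tagged setValue:@"1" forHTTPHeaderField:SketchCachingHeader];
    return tagged;
}

- (void)startLoading
{
    // Forwarding and caching omitted in this sketch.
}

- (void)stopLoading
{
    // A full implementation would cancel the forwarded connection here.
}

@end
```

Because the tagged copy carries the header, the system falls through to the next registered handler for it, which is the normal HTTP loader.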

Again, this intercepts all HTTP traffic. That could intercept pages you don’t want. If so, you can modify canInitWithRequest: to select just things you want to cache (for instance, you could turn off caching for URLs that include parameters or POST requests).
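One possible filter, sketched under the assumption that you only want to cache simple GET requests for URLs without a query string. The function name is hypothetical; you would call it from canInitWithRequest: alongside the X-RNCache check.

```objc
#import <Foundation/Foundation.h>

// Hypothetical predicate: YES only for HTTP GET requests without a query string.
static BOOL SketchShouldCacheRequest(NSURLRequest *request)
{
    NSURL *url = [request URL];

    // Only plain HTTP is handled by this sketch.
    if (![[[url scheme] lowercaseString] isEqualToString:@"http"]) {
        return NO;
    }

    // Leave POST, PUT, and other methods to the normal loading system.
    NSString *method = [request HTTPMethod];
    if (method != nil && [method caseInsensitiveCompare:@"GET"] != NSOrderedSame) {
        return NO;
    }

    // Skip URLs that carry parameters, since their results tend to vary.
    if ([[url query] length] > 0) {
        return NO;
    }

    return YES;
}
```

Returning NO simply means the request is handled as if the caching protocol were not registered at all.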

Wrap-up

This technique isn’t a replacement for a full caching engine like AFCache or an offline REST engine like MKNetworkKit. It’s intended to solve a single, simple problem (though it can be extended to solve much more complicated problems). NSURLProtocol is extremely powerful, and I’ve used it extensively when I need to eavesdrop on network traffic (such as in PandoraBoy’s several ProxyURLProtocol classes). It’s well worth adding to your toolkit.

The code is in the attached project. Look in RNCachingURLProtocol.m.

EDIT: Be sure to see Nick Dowell’s modification in the comments to handle HTTP redirect.

EDIT2: In cachePathForRequest:, I use hash to uniquely identify the URLs. For long, similar URLs, this collides a lot. (See CFString.c for comments on how the hash function is implemented.) The better thing to use is MD5 or SHA1 or something, but those aren’t built-in on iOS prior to iOS 5, so you’d have to implement your own (and I don’t need it that badly for my current projects). This is something you’d want to fix before using this seriously.
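For illustration, here is a sketch of deriving a file name from a SHA1 digest of the URL instead of hash. It assumes CommonCrypto's CC_SHA1 is available on your deployment target, and the function name is hypothetical.

```objc
#import <Foundation/Foundation.h>
#import <CommonCrypto/CommonDigest.h>

// Hypothetical helper: returns a 40-character lowercase hex SHA1 digest of the
// URL's absolute string, suitable for use as a cache file name.
static NSString *SketchSHA1FileNameForURL(NSURL *url)
{
    NSData *bytes = [[url absoluteString] dataUsingEncoding:NSUTF8StringEncoding];

    unsigned char digest[CC_SHA1_DIGEST_LENGTH];
    CC_SHA1([bytes bytes], (CC_LONG)[bytes length], digest);

    NSMutableString *hex = [NSMutableString stringWithCapacity:CC_SHA1_DIGEST_LENGTH * 2];
    for (NSUInteger i = 0; i < CC_SHA1_DIGEST_LENGTH; i++) {
        [hex appendFormat:@"%02x", digest[i]];
    }
    return hex;
}
```

The result could replace the hash-based file name when building the path in cachePathForRequest:. Note that changing the naming scheme orphans any files cached under the old names.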

CachedWebView Example Project

  1. August 3rd, 2012 at 09:08 | #1

    @bhavik There is no built-in limit other than the device storage.

  2. bhavik
    August 6th, 2012 at 03:35 | #2

    thanks Rob.

  3. Kama
    August 6th, 2012 at 05:11 | #3

    Hi Rob, I have implemented RNCachingURLProtocol, and all my cached data is stored directly in the Caches directory, without any subdirectory. If I want to create my own subdirectory inside the Caches folder for the cached data, how can I achieve that?

  4. Kama
    August 16th, 2012 at 05:07 | #4

    hey Rob I need your help on one situation,

    I’m using your RNCachingURLProtocol to store UIWebView pages in the cache. I made some changes so that whenever the user opens the app, it shows the cached data in the UIWebView, and I added some logic that tells me whether an update is available from my server. Whenever an update is found for a page that was cached before, I show an update button to notify the user, and when the user taps it I want to update just that particular page, not the whole cache. I tried starting a URL connection for the updated URL provided by the server, but it isn’t working very well: after tapping the update button, when I open my application again it shows the page without the CSS. Please suggest how I can do this programmatically. Thanks in advance.

  5. Dennis
    October 30th, 2012 at 09:26 | #5

    Hey Rob. First, thank you so much for your work. I just ordered your book, also because of the caching chapter. I think I have a simple task: I want to use your class, changing only the caching location (as mentioned in your blog) to prevent the system from deleting it at any time. So I changed the cachePath in cachePathForRequest: to be inside the NSApplicationSupportDirectory (for iOS 5.1, with the key NSURLIsExcludedFromBackupKey so it isn’t backed up to iCloud). Everything seems to work, even if I cut off the internet connection and start using cached data. But when I go to the home screen and reopen the app, I have a blank page. Do you have any idea why it’s not using the new cache location? I am fairly new to caching, so I might be missing something… Thx for any help!

    • October 30th, 2012 at 10:17 | #6

      My first suspicion would be that you’re not registering the NSURLProtocol early enough. It has to be registered before the web view tries to load its page.

      My next suspicion is that you have a problem in applicationWillEnterForeground:. If you go to the home screen and come back immediately, you probably don’t actually terminate the app. I’d make sure that you’re not messing something up there.

      Of course I’d start by removing the caching (don’t register the protocol) and making sure that the problem doesn’t persist. Often when chasing these kinds of problems, I discover that the system is actually broken somewhere else entirely. UIWebView is sometimes a bit funny about coming back from the background, with or without caching.

  6. Jan
    October 31st, 2012 at 05:31 | #7

    Hi Rob,

    I’m working on an iPhone project where I would like to be able to update my “about us” page without having to send out updates or push notifications to users. To do this I was planning to use a UIWebView to display my page.

    This page won’t update that often, so I am looking for a fairly permanent solution to this, but I would also like a general solution to this kind of caching because I may do this in a few other areas too.

    I implemented a hit-and-miss solution using NSURLCache, but it sometimes loses the style sheets or an image or two. Would your solution allow permanent/reliable storage of web pages in my application? The reason I ask is because I am currently developing in MonoTouch and would need to translate the Objective-C to C# to suit my needs.

    Any advice you have on the subject would be greatly appreciated.

    Jan

    • October 31st, 2012 at 14:52 | #8

      Yes, this is the use-case that I originally developed this technique for. I have no idea about how it would integrate with MonoTouch.

  7. Dennis
    October 31st, 2012 at 13:14 | #9

    Thanks a lot. Your first guess was already right.

  8. Dennis
    October 31st, 2012 at 16:17 | #10

    One more thing: I just realized that the NSURLConnectionDelegate methods used here are deprecated! After doing some research (http://stackoverflow.com/questions/7862316/ios5-nsurlconnection-methods-deprecated) I realized that some methods were moved to another protocol (NSURLConnectionDataDelegate), which is not documented! However, as I understand it, you can still use the same methods since they are just in a different place. The only gap is in the documentation.

  9. Jan
    November 1st, 2012 at 04:40 | #11

    @Rob Napier

    I’ll let you know how I get on!

  10. Jan
    November 1st, 2012 at 12:04 | #12

    Got it working in the end. Really appreciated your example, had been trying to get this working for a long time now. @Rob Napier

  11. Matthew Adams
    November 10th, 2012 at 11:40 | #13

    Huge thanks for this! Helped me out a lot.

  12. Rockky
    December 11th, 2012 at 04:23 | #14

    Can we pass a username and password via this, like we can using ASIHTTPRequest? Also, if I am loading multiple URLs, does it cache data for all of them to be viewed offline?

    Thanks.

    • December 14th, 2012 at 12:41 | #15

      This cache caches everything that is returned for a given URL, and associates it with that URL string. So it all depends on how you implement the username and password passing and what behavior you want. If you request “http://www.example.com/something” and provide a session cookie, then the result will be cached such that the next time you ask for “http://www.example.com/something” you will get the same data back.

  13. Juanita
    December 25th, 2012 at 22:05 | #16

    Hey Rob! Excellent article! I’ve been looking for something like this for a while. I have a question, though. I’m making an app that should cache 10 different web apps. How can I make multiple caches? Please reply :)

    • January 2nd, 2013 at 13:15 | #17

      The code as written will cache every URL, so there shouldn’t be a need for separate caches for separate websites. If you want to control precisely which URLs are cached and where, you would make those decisions in startLoading.

  14. Rony
    December 26th, 2012 at 02:44 | #18

    Hey, thank you so much for this! I wanted to ask if I can make multiple caches, as I want to store around 10 web apps in my app.

  15. Ulli
    January 3rd, 2013 at 15:37 | #19

    Hi Rob,

    Thanks for this great article! I have one problem: I must be able to delete cached entries. You answered Juanita:

    “The code as written will cache every URL, so there shouldn’t be a need for separate caches for separate websites. If you want to control precisely which URLs are cached and where, you would make those decisions in startLoading.”

    I have no idea, how to do this. :-( Can you please give me a short sample?

    Many TIA! Regards Ulli

    • January 3rd, 2013 at 15:48 | #20

      To delete cached entries, just delete the file.

      NSError *error = nil;
      [[NSFileManager defaultManager] removeItemAtPath:[self cachePathForRequest:request] error:&error];

      You might want to re-write cachePathForRequest: to take an NSURL rather than an NSURLRequest, but it should be straightforward. Note that this is just the bare bones of a caching engine, focused on how to use NSURLProtocol correctly. If you need a fully implemented cache engine with lots of features, see AFCache.

  16. Ulli
    January 4th, 2013 at 10:46 | #21

    Dear Rob,

    many THX. It works :-)

    Regards Ulli

  17. Ulli
    January 5th, 2013 at 11:44 | #22

    Hi Rob,

    I still have one newbie problem: how can I tell the protocol that an entry should be removed? Can I set a property (how?), or do I have access to the web view? Absolutely clueless… :-( Could you please help me again?

  18. January 5th, 2013 at 11:57 | #23

    The NSURLProtocol does not remove files from the cache. The NSURLProtocol is just a hook into all NSURLConnection-based connections (this includes UIWebView). It is up to you to decide what it should do within that hook, and how you would like to manage your cache outside of that hook. If you want to delete files from your cache, then just remove the files. Look in cachePathForRequest: for how the path for a given URL is determined.

  19. gambogo
    January 6th, 2013 at 22:54 | #24

    Hey Rob, this is very useful. thanks very much

  20. February 18th, 2013 at 03:21 | #25

    Hi, Rob

    Can u speak Chinese?

    • February 18th, 2013 at 11:09 | #26

      不好. It’s a hobby, but without a dictionary and some time I’m pretty lost. I’m only about a step up from the typical 你好谢谢 tourist :D

  21. February 19th, 2013 at 04:28 | #27

    @Rob Napier I’m so surprised that you replied, because I only asked as a joke. I’m glad to read your article, even though I can’t understand all of it. I have a question: can RNCachingURLProtocol be used when my app has many URLs to cache, and how do I clean up the cache? I’m a Chinese iOS software engineer with poor English.

    Thanks a lot.

  22. Smurfie
    February 27th, 2013 at 12:00 | #28

    Hey Rob, thank you for your solution. Is it possible to cache only websites, and not the requests that are fired by RestKit? My motivation is that my JSON responses contain data that needs to be encrypted and shouldn’t lie unencrypted in your cache.

    Thanks Smurfie

    • February 27th, 2013 at 13:02 | #29

      It will cache everything that goes through NSURLConnection. That includes web views. I haven’t looked at how RestKit is implemented to see how it works, but it probably would cache that data. You can decide what you want to cache or not in +canInitWithRequest:. Just return NO for any requests you don’t want to cache.

  23. March 4th, 2013 at 13:30 | #30

    Hi, it works great, but it has a problem with Google Page Translate: the page is not reloaded with the translated text.

    Hope there’s a solution for this.

    • March 19th, 2013 at 14:03 | #31

      I don’t really know what this means. The cache engine maps URLs to results. So for a given URL, you will always receive the same result. URLs are effectively GET requests, so no POST information can be cached. Nor can JavaScript results be cached (since they’re not URL requests). So if Google Translate is implemented as a specific URL, this cache will keep track of the result returned. If it is not mapped to a specific URL, then you’ll likely have to build a custom caching system for that.

  24. smurfie
    March 6th, 2013 at 13:35 | #32

    Hey Rob, I struggled for several hours today because every PUT or POST request I sent resulted in a GET request. Then your framework came to mind.

    I think you should fix this. I solved it by filtering the requests.

  25. David James
    April 5th, 2013 at 05:24 | #33

    Hi Rob,

    In RNCachingURLProtocol startLoading method, when you pass the cached response to the URL loading system, you include a caching policy of NSURLCacheStorageAllowed. Why did you do this if the intent is to handle all the caching yourself? Why not use NSURLCacheStorageNotAllowed ?

    • April 5th, 2013 at 19:15 | #34

      Check the GitHub repository. It should be NSURLCacheStorageNotAllowed there.

  26. Reed
    May 5th, 2013 at 18:43 | #35

    Hi, I have been researching the best way to do offline caching and this seems great, but I am still considering just using HTML5 local storage. What are some of the advantages of this method over using HTML5? Thanks!

    • May 5th, 2013 at 19:54 | #36

      This technique is a very primitive solution for very simple web caching, mostly of a handful of simple HTML pages. It is not intended as a general-purpose caching system for complex pages. (It is intended mostly to demonstrate how to use NSURLProtocol, not HTTP caching.) HTML5 local storage I believe would require you to modify your web pages, while NSURLProtocol requires that you modify your client. If you control both, I’d say do whatever is more comfortable for you.

  27. besimo
    May 28th, 2013 at 10:40 | #37

    Hey Rob, thanks for that great framework.

    One question I have: You once answered “…Nor can JavaScript results be cached (since they’re not URL requests)….” Do any of the other frameworks besides yours offer that? I have a lot of AJAX calls, which make up about 90% of the content.

    Best, besimo

    • May 28th, 2013 at 11:50 | #38

      Anything that creates an URL request will be cached. That generally includes AJAX (provided the resulting URL is exactly the same each time). A “JavaScript result” refers to HTML that is generated by JS and never becomes an URL request. That said, heavily AJAX’ed pages are often quite difficult to cache correctly by any system if they don’t create consistent URLs. (And AJAX systems often intentionally prevent caching, since the results are often ephemeral.)

      Note that RNCachingURLProtocol is not a full caching solution. It is sample code to demonstrate how to write NSURLProtocol code. I do happen to use it in some production code, and apparently many others do as well, but it is intentionally very simple and limited. It is assumed that the caller will rewrite parts of it to suit the product’s specific needs.

      I recommend reading http://nshipster.com/nsurlcache/ for good information on configuring NSURLCache directly. Since iOS 5, it can handle many more cases directly.

  1. February 13th, 2012 at 11:21 | #1