Uploads got you Down?
Anyone who has attempted a large HTTP file upload from a memory-constrained device will quickly discover serious roadblocks. Using the HTTPBodyStream property of an NSMutableURLRequest instead of HTTPBody is simple enough in theory, but it’s not as simple as sending a single file. There are HTTP headers and query string parameters and cookies and multipart encoding strings and boundary identifiers and all sorts of other things that need to be written in front of, behind, and around the data that you are going to upload. So what kind of input stream can we use?
Apple provides three options, but none of them work very well for our needs. Creating an NSInputStream from an NSData buys us nothing, since the whole payload still has to sit in memory, just as it does with HTTPBody. Creating one from a file path is a potential option: write out all the headers and other parts to a temporary file and then hand it to the NSMutableURLRequest as a single file stream. But that copies every byte of the upload to disk a second time, which is incredibly inefficient. Creating a low-level BSD socket and reading and writing from a pair of socket streams is a slightly less inefficient option, but is prone to the dangers of using Darwin-level API calls that Apple makes no guarantees about.
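For contrast, here is roughly what the memory-bound approach looks like. This is only a sketch for a single file part; `BZNaiveMultipartBody` is a hypothetical helper name, not something from Apple or from the class in this post, and it assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (illustration only): builds an entire multipart body
// in memory, which is exactly what we want to avoid for large files.
static NSData *BZNaiveMultipartBody(NSString *boundary, NSString *key, NSString *path) {
    NSMutableData *body = [NSMutableData data];

    // Part header: boundary line, disposition, type, then a blank line.
    NSString *header = [NSString stringWithFormat:
        @"--%@\r\nContent-Disposition: form-data; name=\"%@\"; filename=\"%@\"\r\n"
        @"Content-Type: application/octet-stream\r\n\r\n",
        boundary, key, [path lastPathComponent]];
    [body appendData:[header dataUsingEncoding:NSUTF8StringEncoding]];

    // The problem line: the file is read into memory in full.
    NSData *contents = [NSData dataWithContentsOfFile:path];
    if (contents) { [body appendData:contents]; }

    // Closing boundary.
    NSString *footer = [NSString stringWithFormat:@"\r\n--%@--\r\n", boundary];
    [body appendData:[footer dataUsingEncoding:NSUTF8StringEncoding]];
    return body;
}
```

Handing the result to HTTPBody works fine for a thumbnail, but for a video the peak memory use is at least the size of the file.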
The obvious solution, of course, is to create your own subclass of NSInputStream that handles both blocks of data and sub-streams that point to underlying file system data, so let’s try that here:
#import "BZMultipartStream.h"
#import <objc/runtime.h>

#define kBZDefaultMimeType @"application/octet-stream"
#define kBZFieldFormat @"--%@\r\nContent-Disposition: form-data; name=\"%@\"\r\n\r\n%@"
#define kBZFieldFooter @"\r\n"
#define kBZDataFormat @"--%@\r\nContent-Disposition: form-data; name=\"%@\"; filename=\"%@\"\r\nContent-Type: %@\r\nContent-Transfer-Encoding: binary\r\n\r\n"
#define kBZFooterFormat @"--%@--\r\n"

@implementation BZMultipartStream {
    NSMutableDictionary* properties;
    NSData* footer;          // closing boundary
    NSData* fieldend;        // CRLF that terminates each part
    NSMutableData* fields;   // all plain form fields, sent first
    NSMutableArray* streams; // sub-streams, read in order
    NSUInteger index;        // sub-stream currently being read
    unsigned long sent;      // total bytes handed out so far
}

@synthesize boundary, length, streamStatus, streamError;

- (id) init {
    if (self = [super init]) {
        properties = [[NSMutableDictionary alloc] init];
        boundary = [[NSProcessInfo processInfo] globallyUniqueString];
        footer = [[NSString stringWithFormat:kBZFooterFormat, boundary] dataUsingEncoding:NSUTF8StringEncoding];
        streams = [[NSMutableArray alloc] init];
        fields = [[NSMutableData alloc] init];
        fieldend = [kBZFieldFooter dataUsingEncoding:NSUTF8StringEncoding];
        length = footer.length;
        sent = 0;
        index = 0;
    }
    return self;
}

- (void) appendField:(NSData*)value forKey:(NSString*)key {
    // Format only the part header and append the value as raw bytes:
    // formatting an NSData with %@ would emit its description, not its contents.
    NSString* header = [NSString stringWithFormat:kBZFieldFormat, boundary, key, @""];
    NSData* encoded = [header dataUsingEncoding:NSUTF8StringEncoding];
    [fields appendData:encoded];
    [fields appendData:value];
    [fields appendData:fieldend];
    length += encoded.length + value.length + fieldend.length;
}

- (void) appendData:(NSData*)data forKey:(NSString*)key withMimeType:(NSString*)mime {
    NSString* formatted = [NSString stringWithFormat:kBZDataFormat, boundary, key, key, mime ? mime : kBZDefaultMimeType];
    NSData* encoded = [formatted dataUsingEncoding:NSUTF8StringEncoding];
    [streams addObject:[[NSInputStream alloc] initWithData:encoded]];
    [streams addObject:[[NSInputStream alloc] initWithData:data]];
    [streams addObject:[[NSInputStream alloc] initWithData:fieldend]];
    length += encoded.length + data.length + fieldend.length;
}

- (void) appendFile:(NSString*)path forKey:(NSString*)key withMimeType:(NSString*)mime {
    NSString* formatted = [NSString stringWithFormat:kBZDataFormat, boundary, key, [path lastPathComponent], mime ? mime : kBZDefaultMimeType];
    NSData* encoded = [formatted dataUsingEncoding:NSUTF8StringEncoding];
    [streams addObject:[[NSInputStream alloc] initWithData:encoded]];
    [streams addObject:[[NSInputStream alloc] initWithFileAtPath:path]];
    [streams addObject:[[NSInputStream alloc] initWithData:fieldend]];
    length += encoded.length + [[[NSFileManager defaultManager] attributesOfItemAtPath:path error:NULL] fileSize] + fieldend.length;
}

- (NSInputStream*) currentStream {
    // Skip exhausted sub-streams; return nil once all of them are consumed
    // instead of indexing past the end of the array.
    while (index < [streams count]) {
        NSInputStream* stream = [streams objectAtIndex:index];
        if (stream.streamStatus == NSStreamStatusNotOpen) { [stream open]; }
        if (stream.streamStatus != NSStreamStatusAtEnd) { return stream; }
        [stream close];
        index++;
    }
    return nil;
}

- (NSInteger)read:(uint8_t *)buffer maxLength:(NSUInteger)maxlen {
    streamStatus = NSStreamStatusReading;
    NSInputStream* stream = nil;
    NSInteger read = 0;
    // A sub-stream returns 0 when it is exhausted, so move on to the next one.
    while ((stream = [self currentStream]) && (read = [stream read:buffer maxLength:maxlen]) == 0) {
        [stream close];
        index++;
    }
    if (!stream) {
        streamStatus = NSStreamStatusAtEnd;
        return 0;
    }
    if (read < 0) {
        // Propagate a sub-stream failure rather than adding it to the byte count.
        streamError = stream.streamError;
        streamStatus = NSStreamStatusError;
        return read;
    }
    sent += read;
    if (sent >= length) { streamStatus = NSStreamStatusAtEnd; }
    return read;
}

- (BOOL) hasBytesAvailable { return sent < length; }

- (void) close { streamStatus = NSStreamStatusClosed; }

- (void) open {
    if (streamStatus != NSStreamStatusNotOpen) {
        [NSException raise:NSInternalInconsistencyException format:@"Stream is already open!"];
    }
    streamStatus = NSStreamStatusOpening;
    // Plain fields go first, the closing boundary goes last.
    if (fields.length) { [streams insertObject:[[NSInputStream alloc] initWithData:fields] atIndex:0]; }
    [streams addObject:[[NSInputStream alloc] initWithData:footer]];
    streamStatus = NSStreamStatusOpen;
}

@end
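Here is one way the class might be wired into a request. Treat it as a sketch under assumptions: the URL, the key name, and the MIME type are placeholders, `BZUploadRequest` is a hypothetical helper, and it assumes the header exposes `boundary` and `length` as readable properties (they are synthesized above, but the header is not shown here).

```objc
#import <Foundation/Foundation.h>
#import "BZMultipartStream.h"

// Hypothetical helper: builds a POST request whose body is streamed
// from disk instead of being assembled in memory.
static NSMutableURLRequest *BZUploadRequest(NSString *path) {
    BZMultipartStream *body = [[BZMultipartStream alloc] init];
    [body appendFile:path forKey:@"upload" withMimeType:@"video/quicktime"];

    // Placeholder endpoint, not a real service.
    NSURL *url = [NSURL URLWithString:@"https://example.com/upload"];
    NSMutableURLRequest *request = [NSMutableURLRequest requestWithURL:url];
    [request setHTTPMethod:@"POST"];

    // The server needs the same boundary string the stream writes.
    [request setValue:[NSString stringWithFormat:@"multipart/form-data; boundary=%@", body.boundary]
   forHTTPHeaderField:@"Content-Type"];

    // Append every part before reading length, since it grows with each append.
    [request setValue:[NSString stringWithFormat:@"%llu", (unsigned long long)body.length]
   forHTTPHeaderField:@"Content-Length"];

    [request setHTTPBodyStream:body];
    return request;
}
```

The request can then be handed to the connection as usual, for example with `[NSURLConnection connectionWithRequest:request delegate:self]`.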
So now you have a multiplexed input stream, and in all your local testing it works great. So, it’s ready for prime-time, right? Whoops, not quite. You’ll quickly discover that mysterious, undocumented calls are being made on your subclass, calls that you haven’t implemented or have even heard of, namely _setCFClientFlags:callback:context: and _scheduleInCFRunLoop:forMode:, and they are crashing your code. What’s going on here?
First a quick note about NSStreams: they are not self-propelled, at least not all of them. For example, if you are writing out to a file, do you want each and every byte written to the disk individually? Or is it better to write to a memory buffer and then flush the buffer periodically? The latter is much more efficient, but who is in charge of the flushing operations? NSStreams answer this question by leaving the choice up to the caller, who can supply a run loop that the stream uses to perform its operations.
If you implemented your new NSStream according to the documentation, you’ll have seen that scheduleInRunLoop:forMode: is required to create a proper NSStream subclass. So why is that method not being called, while this undocumented _scheduleInCFRunLoop:forMode: call is being made instead? This has to do with “Toll Free Bridging”.
Although there are many NS classes that are perfectly “toll free” bridged to their CF counterparts, NSRunLoop is not one of them. CFRunLoop has a similar name, but the two are not interchangeable. Instead of calling the NSRunLoop methods that you implemented, the NSURLConnection is trying to use the CF equivalents, methods that are undocumented.
It turns out that working around this is more trivial than it seems once you understand what is going on. The NSInputStreams that read from NSData and from files are perfectly capable of managing their own read operations without an external run loop, and since we plan to be multiplexing only those types of input to create our subclass, these two calls can be safely ignored altogether.
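To make the run loop idea concrete, this is roughly how a caller drives an ordinary stream through the documented API. `BZStreamReader` is a hypothetical class used only for illustration, and the sketch assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical reader: the caller decides which run loop pumps the stream.
@interface BZStreamReader : NSObject <NSStreamDelegate>
- (void)startReadingFileAtPath:(NSString *)path;
@end

@implementation BZStreamReader {
    NSInputStream *input;
}

- (void)startReadingFileAtPath:(NSString *)path {
    input = [[NSInputStream alloc] initWithFileAtPath:path];
    input.delegate = self;
    // Events are delivered on this run loop once the stream is open.
    [input scheduleInRunLoop:[NSRunLoop currentRunLoop] forMode:NSDefaultRunLoopMode];
    [input open];
}

- (void)stream:(NSStream *)stream handleEvent:(NSStreamEvent)event {
    if (event == NSStreamEventHasBytesAvailable) {
        uint8_t buffer[4096];
        NSInteger count = [input read:buffer maxLength:sizeof(buffer)];
        NSLog(@"read %ld bytes", (long)count);
    } else if (event == NSStreamEventEndEncountered || event == NSStreamEventErrorOccurred) {
        // Tear down in the reverse order of setup.
        [input close];
        [input removeFromRunLoop:[NSRunLoop currentRunLoop] forMode:NSDefaultRunLoopMode];
    }
}

@end
```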
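The difference is easy to see in code. In this sketch (ARC assumed, `BZBridgingDemo` is a hypothetical function name) the stream can simply be cast to its CF type, while the run loop has to be asked for its CF counterpart:

```objc
#import <Foundation/Foundation.h>

static void BZBridgingDemo(NSString *path) {
    // NSInputStream and CFReadStreamRef are toll-free bridged: a cast is enough.
    NSInputStream *input = [NSInputStream inputStreamWithFileAtPath:path];
    CFReadStreamRef cfInput = (__bridge CFReadStreamRef)input;
    CFReadStreamOpen(cfInput);   // operates on the same object as [input open]
    CFReadStreamClose(cfInput);

    // NSRunLoop is not bridged: it only hands out its underlying CFRunLoop.
    CFRunLoopRef cfLoop = [[NSRunLoop currentRunLoop] getCFRunLoop];
    NSLog(@"current CFRunLoop: %p", cfLoop);
}
```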
We still need them to be answered in our class when the URLConnection calls them, and we need to try to avoid getting caught up in any automatic “undocumented API” filters in the App Store submission. Let’s do that by not having those methods implemented directly, but instead have resolveInstanceMethod try to forward any undocumented calls to methods that don’t directly override undocumented APIs:
// Stand-ins for the private methods, named without the leading underscore.
- (BOOL) setCFClientFlags:(CFOptionFlags)flgs callback:(CFReadStreamClientCallBack)cb context:(CFStreamClientContext*)ctx {
    return NO;
}

- (void) scheduleInCFRunLoop:(CFRunLoopRef)loop forMode:(CFStringRef)mode {}

- (void) unscheduleFromCFRunLoop:(CFRunLoopRef)loop forMode:(CFStringRef)mode {}

// When an unimplemented selector starting with "_" arrives, look for a method
// with the same name minus the underscore and install it under the private name.
+ (BOOL) resolveInstanceMethod:(SEL) selector {
    NSString * name = NSStringFromSelector(selector);
    if ([name hasPrefix:@"_"]) {
        Method method = class_getInstanceMethod(self, NSSelectorFromString([name substringFromIndex:1]));
        if (method) {
            class_addMethod(self, selector, method_getImplementation(method), method_getTypeEncoding(method));
            return YES;
        }
    }
    return [super resolveInstanceMethod:selector];
}
Simply add these lines to the class above and presto, now you have a fully functional NSInputStream that multiplexes substreams and allows you to upload any number of large files without ruining your application’s memory profile.
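If you want to confirm the forwarding is in place before trusting it with a real upload, a debug-only check along these lines may help. It is a sketch, not part of the class: `BZCheckPrivateSelectors` is a hypothetical name, and it relies on respondsToSelector: giving the dynamic resolver a chance to install the method, which Apple documents for resolveInstanceMethod:.

```objc
#import <Foundation/Foundation.h>
#import "BZMultipartStream.h"

// Debug-only sanity check; keep it out of release builds.
static void BZCheckPrivateSelectors(void) {
    BZMultipartStream *stream = [[BZMultipartStream alloc] init];
    NSArray *names = @[@"setCFClientFlags:callback:context:",
                       @"scheduleInCFRunLoop:forMode:",
                       @"unscheduleFromCFRunLoop:forMode:"];
    for (NSString *name in names) {
        // Build the underscored selector at runtime, as the resolver expects.
        SEL privateSel = NSSelectorFromString([@"_" stringByAppendingString:name]);
        BOOL handled = [stream respondsToSelector:privateSel];
        NSLog(@"_%@ -> %@", name, handled ? @"handled" : @"MISSING");
    }
}
```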
2 Comments
Thank you, Mason. That was a great efficient solution, works well when application is on the foreground. We had some hiccups using this in the background. Would you publish file stream solution?
Can you give a working example using this code? I would greatly appreciate it. Thanks!