
Working with tempfiles in Ruby

A while back, I needed to create XML files, send them to a distant server and delete them once the transfer completed.

At first, I thought about creating those files in my app’s tmp directory. Then, a cron job would run daily to delete them. It would have worked, but I wasn’t very happy with it.

So after looking on the internet, I came across the Ruby Tempfile object (and I - for the umpteenth time - rejoiced in using Ruby as my main programming language).

"Tempfile is a utility class for managing temporary files. It behaves just like a File object, and you can perform all the usual file operations on it: reading data, writing data, changing its permissions, etc." (the Ruby documentation)

How to use a tempfile in Ruby?

Let’s create a simple example. First, I’ll create an object that generates a tempfile when instantiated. Then I’ll export it on demand.

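  require 'tempfile'
  require 'net/ftp'

  class FileExporter
    def initialize
      @my_file = create_file
    end

    def create_file
      # Tempfile.open returns the block's value and closes the file afterwards.
      Tempfile.open do |file|
        file.write('Bob wuz here.')
        file.rewind
        file
      end
    end

    def export_file
      Net::FTP.open('my_distant_server_address.com') do |ftp|
        ftp.login
        # The tempfile is closed by now, so upload it by path.
        ftp.putbinaryfile(@my_file.path, File.basename(@my_file.path))
      end
    end
  end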

There! Here is the cool thing: once I’m done with my FileExporter and nothing references the tempfile anymore, Ruby deletes the tempfile automatically when the object is garbage-collected. No need to do it manually.

But are there any differences between Tempfile and File objects?

Tempfiles versus files

When you create a file, you can access it and perform operations on it from anywhere in your codebase as long as you have its path. Tempfiles behave slightly differently.

When you create a tempfile, you can perform operations on it as long as its reference exists. Once the reference disappears, the garbage collector automatically claims the tempfile.

Let me show you.

First, let’s create a normal file.

  class FileCreator
    def create_file
      file = File.new('my_file.md', 'w+') # 'w+' creates the file; 'r+' raises if it doesn't exist yet
      file.write('Bob wuz here.')
      file.rewind
      file
    end
  end
  file_creator = FileCreator.new

  my_file = file_creator.create_file # => #<File:my_file.md>
  File.basename(my_file.path)        # => "my_file.md"
  my_file.read                       # => "Bob wuz here."

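You’ll notice two things: I had to pick the file’s name myself, and the file is still open and readable after create_file returns.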

Now, let’s create a tempfile.

  require 'tempfile'

  class FileCreator
    def create_file
      Tempfile.open do |file|
        file.write('Bob wuz here.')
        file.rewind
        file
      end
    end
  end
  file_creator = FileCreator.new

  # The directory in the inspected path varies by system.
  my_file = file_creator.create_file # => #<Tempfile:/tmp/20200123-83768-1wrlh4s (closed)>
  File.basename(my_file.path)        # => "20200123-83768-1wrlh4s"
  my_file.read                       # => IOError: closed stream
  my_file.closed?                    # => true

Here, the tempfile doesn’t need any input on my part for its name. It’s automatically generated. Note that it is possible to specify a name and an extension too.
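Here is a minimal sketch of that last point. Tempfile.new accepts an array holding a basename prefix and an extension. The 'report' prefix and the '.xml' extension are made up for this example, and Ruby still inserts a unique string between them.

```ruby
require 'tempfile'

# 'report' and '.xml' are hypothetical values chosen for illustration.
named_file = Tempfile.new(['report', '.xml'])
named_basename = File.basename(named_file.path)

named_basename.start_with?('report') # => true
File.extname(named_file.path)        # => ".xml"

# Clean up explicitly rather than waiting for the garbage collector.
named_file.close
named_file.unlink
```

The generated name keeps the prefix and the extension, which is handy when the receiving end cares about the file type.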

But the really interesting part is the IOError: closed stream.

It means that I can no longer perform operations on my tempfile - like reading its content - because the stream is now unavailable.
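To make "closed stream" a bit more concrete, here is a small sketch (the variable names are mine). It suggests that only the stream is gone: the file itself is still on disk, and Tempfile#open can reopen it.

```ruby
require 'tempfile'

# Tempfile.open with a block closes the stream when the block ends,
# but it does not delete the file.
closed_file = Tempfile.open do |file|
  file.write('Bob wuz here.')
  file
end

closed_file.closed?           # => true
File.exist?(closed_file.path) # => true

# Reading through the closed stream fails...
read_error =
  begin
    closed_file.read
  rescue IOError => e
    e
  end

# ...but Tempfile#open reopens the same file for reading and writing.
closed_file.open
reopened_content = closed_file.read # => "Bob wuz here."

closed_file.close
closed_file.unlink
```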

And why has the stream become unavailable? Here, I have two suspicions:

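1) Because my tempfile was automatically closed when leaving its original context (the Tempfile.open do [...] end bit) and claimed by the garbage collector. The Ruby documentation hints at this:

"I/O streams are automatically closed when they are claimed by the garbage collector." (the Ruby doc)

That said, my_file still holds a reference to the tempfile here, so the garbage collector has no reason to claim it yet.

2) Because I use Tempfile.open { ... } (instead of Tempfile.new), Tempfile#close is implicitly called at the end of the block and closes the stream. This matches what the Tempfile.open documentation describes: the Tempfile object is automatically closed after the block terminates.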

In any case, your temporary file will be deleted once the object is finalized, which happens at some point after you lose the last reference to the object (or when the Ruby process exits).
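Since the garbage collector decides when finalization happens, you can also clean up explicitly instead of waiting for it. A minimal sketch, assuming you create the tempfile with Tempfile.new so the stream stays open until you close it yourself:

```ruby
require 'tempfile'

file = Tempfile.new
begin
  file.write('Bob wuz here.')
  file.rewind
  content = file.read # the stream is still open here
ensure
  file.close
  file.unlink # deletes the file now instead of waiting for the GC
end
```

The ensure block runs even if something raises in between, so the file never lingers longer than needed.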

Tempfiles’ quirks

Tempfiles are extremely useful when handled in strictly defined contexts, like a railway-oriented business transaction. But the following example can cause unexpected problems.

  require 'tempfile'

  class FileCreator
    def create_file
      file = Tempfile.new
      file.write('Bob wuz here.')
      file.close
      file.path
    end
  end
  file_creator = FileCreator.new

  # create_file returns a String; the directory varies by system.
  my_file_path = file_creator.create_file # => "/tmp/20200123-83768-1wrlh4s"
  File.read(my_file_path)                 # => "Bob wuz here."

FileCreator#create_file now returns a path, not an actual reference to the file. Once the method returns, nothing references the Tempfile object anymore, so it becomes eligible for garbage collection. Most of the time, you’ll still be able to read your file. But sometimes, you’ll get an Errno::ENOENT: No such file or directory @ rb_sysopen error. Why? Because the garbage collector will have claimed the Tempfile object during one of its collections, and its finalizer will have deleted the file. One solution would be to return the actual file - hence a reference.
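The "return the actual file" idea could look like the sketch below. The class name ReferenceKeepingFileCreator is hypothetical; I only use it to avoid clashing with the FileCreator above.

```ruby
require 'tempfile'

# Hypothetical variant of FileCreator that returns the Tempfile itself.
class ReferenceKeepingFileCreator
  def create_file
    file = Tempfile.new
    file.write('Bob wuz here.')
    file.close
    file # return the Tempfile, not file.path
  end
end

kept_file = ReferenceKeepingFileCreator.new.create_file

# As long as kept_file is referenced, the file stays on disk.
kept_content = File.read(kept_file.path) # => "Bob wuz here."

# Delete it explicitly once you're done.
kept_file.unlink
```

Holding on to kept_file keeps the Tempfile object alive, so its finalizer can't delete the file while you still need it.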

So be careful when using tempfiles across methods, classes, etc.


This whole tempfiles-are-garbage-collected-and-become-unavailable thing still feels a little fuzzy to me. But I’ll keep digging into memory allocation.

Thank you to the redditors who helped make this article better through their suggestions and questions.

Noticed something? Ping me on Twitter or create an issue on GitHub.

Cheers,

Rémi