About

I'm Mike Pope. I live in the Seattle area. I've been a technical writer and editor for over 30 years. I'm interested in software, language, music, movies, books, motorcycles, travel, and ... well, lots of stuff.

Blog Statistics

Dates
First entry - 6/27/2003
Most recent entry - 12/8/2017

  09:52 AM

Carrying on with adventures using the Tumblr API. (Part 1, Part 2)

As noted, I decided that I wanted to create a local HTML file out of my downloaded/exported Tumblr posts. In my initial cut, I iterated over the list of TumblrPost instances that I'd assembled from the downloaded posts and wrote out a bunch of hard-coded HTML. This worked, but it was inflexible, to say the least. What if I wanted to reorder items or something?

So I fell back on yet another old habit. I created a "template" of the HTML block that I wanted, with known strings that I could swap out for content. Here's the HTML template layout, where strings like %%%posttitle%%% and %%%posturl%%% are placeholders for where I want the content to go:
<!-- tumblr_block_template.html -->
<div class="post">
    <div class="posttitle">%%%posttitle%%%</div>
    <div class="postdate">%%%postdate%%%</div>
    <div class="posttext">%%%posttext%%%</div>
    <div class="postsource">%%%postsource%%%</div>
    <div class="posturl"><a href="%%%posturl%%%"
        target="_blank">%%%posturl%%%</a></div>
    <div class="postctr">[%%%postcounter%%%]&nbsp;
        <span class="posttype">%%%posttype%%%</span>
    </div>
</div>
The idea is to read the template, go through each TumblrPost item, swap in the appropriate member for each placeholder, and build up a series of these blocks. Here's the code to read the template and build the blocks of content:
html_output = ''
 
html_file = open('c:\\Tumblr\\tumblr_block_template.html', 'r')
html_block_template = html_file.read()
html_file.close()
 
ctr = 0
for p in sorted_posts:
    new_html_block = html_block_template
    ctr += 1
    new_html_block = new_html_block.replace('%%%posttitle%%%', p.post_title)
    new_html_block = new_html_block.replace('%%%postdate%%%', p.post_date)
    new_html_block = new_html_block.replace('%%%posttext%%%', p.post_text)
    new_html_block = new_html_block.replace('%%%postsource%%%', p.post_source)
    new_html_block = new_html_block.replace('%%%posturl%%%', p.post_url)
    new_html_block = new_html_block.replace('%%%postcounter%%%', str(ctr))
    new_html_block = new_html_block.replace('%%%posttype%%%', p.post_type)
    html_output += new_html_block
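The chain of replace() calls can also be driven by a dictionary, which keeps the placeholder names in one place. This is only a sketch, written for Python 3 rather than the Python 2 used in this post; the fill_template helper, the one-line template, and the sample values are hypothetical and not part of the script.

```python
def fill_template(template, values):
    """Return template with each %%%name%%% placeholder replaced by values[name]."""
    result = template
    for name, value in values.items():
        # str() lets numeric values such as the counter be passed directly
        result = result.replace('%%%' + name + '%%%', str(value))
    return result

# Hypothetical one-line template and sample data, for illustration only.
block_template = '<div class="posttitle">%%%posttitle%%%</div> [%%%postcounter%%%]'
block = fill_template(block_template, {'posttitle': 'Hello', 'postcounter': 1})
print(block)
```

With this approach, adding a new placeholder to the template means adding one dictionary entry instead of another replace() line.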
To embed these <div> blocks into an HTML file, I did the same thing again—I created a template .html file that looks like this:
<!-- tumblr_template.html -->
<html>
<head>
  <link rel="stylesheet" href="tumbl_posts.css" type="text/css">
  <meta http-equiv="content-type" content="text/html;charset=utf-8">
</head>
<body>
<h1>Tumblr Posts</h1>
%%%posts%%%
</body>
</html>
With this in hand, I can read the template .html file and do the swap thing again, and then write out a new file. To actually write the file, I generated a timestamp to use as part of the file name: 'tumbl_bu-' plus %Y-%m-%d-%H-%M-%S plus '.html'.
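Here is a small sketch of how that file name can be put together. It's written for Python 3, and the backup_file_name helper is a hypothetical name used only for this example; a fixed date is passed in so the result is predictable.

```python
import datetime

def backup_file_name(now=None):
    """Build a file name of the form tumbl_bu-YYYY-MM-DD-HH-MM-SS.html."""
    if now is None:
        now = datetime.datetime.now()
    return 'tumbl_bu-' + now.strftime('%Y-%m-%d-%H-%M-%S') + '.html'

# Fixed timestamp for illustration; call with no argument to use the current time.
print(backup_file_name(datetime.datetime(2017, 12, 8, 9, 52, 0)))
```

Because every field is zero-padded and ordered from year down to second, the generated files sort chronologically by name.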

There was one complication. I got some errors while writing the file out, which turned out to be an issue with Unicode encoding. Apparently certain cites that I pasted into Tumblr contain characters that can’t be converted to ASCII, which is the encoding that Python 2 falls back on when it writes a Unicode string to an ordinary file. The solution there is to use the codecs module to open the output file with an explicit encoding. (It’s possible that this is a problem only in Python 2.x.)
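For comparison, here is a minimal sketch of writing non-ASCII text with an explicit encoding by using io.open, which accepts an encoding argument in Python 2.6+ as well as in Python 3. The sample string and the temporary file path are made up for the example; the script in this post uses codecs.open instead.

```python
import io
import os
import tempfile

# Sample text with curly quotes and an em dash, written as escapes
text = u'A quote with \u201ccurly quotes\u201d and a dash \u2014 here.'

# Write to a throwaway file, stating the encoding explicitly
path = os.path.join(tempfile.mkdtemp(), 'unicode_test.html')
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(text)

# Read it back with the same encoding to confirm nothing was lost
with io.open(path, 'r', encoding='utf-8') as f:
    round_trip = f.read()

print(round_trip == text)
```

The point is only that the encoding is named when the file is opened, so the write does not depend on whatever default the interpreter would otherwise pick.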

Here’s the complete listing for the Python script. (I wrapped some of the lines in a Python-legal way to squeeze them for the blog.)
import datetime, json, requests
import codecs # For converting Unicode in source

class TumblrPost:
    def __init__(self,
                 post_url,
                 post_date,
                 post_text,
                 post_source,
                 post_title,
                 post_type):
        self.post_url = post_url
        self.post_date = post_date
        self.post_text = post_text
        self.post_source = post_source
        self.post_type = post_type
        if post_title is None or post_title == '':
            self.post_title = ''
        else:
            self.post_title = post_title

all_posts = [] # List to hold instances of the TumblrPost class
html_output = '' # String to hold the formatted HTML for all the posts
folder_name = 'C:\\Tumblr\\'

# Get the text posts and add them as TumblrPost objects to the all_posts list
print "Fetching text entries ..."
base_url = 'http://api.tumblr.com/v2/blog/mikepope.tumblr.com/posts/text?api_key=[MY_KEY]'
offset = 0
posts_still_left = True
while posts_still_left:
    # Build the URL fresh on each pass so that offsets don't pile up on the end
    request_url = base_url + "&offset=" + str(offset)
    print "\tFetching text entries (%i) ..." % offset
    tumblr_response = requests.get(request_url).json()
    total_posts = tumblr_response['response']['total_posts']
    for post in tumblr_response['response']['posts']:
        # See https://www.tumblr.com/docs/en/api/v2#text-posts
        p = TumblrPost(post['post_url'],
                       post['date'],
                       post['body'], '',
                       post['title'],
                       'text') # No source for text posts
        all_posts.append(p)
    offset += 20
    if offset >= total_posts:
        posts_still_left = False

# Get the quote posts and add them as TumblrPost objects to the all_posts list.
print "Fetching quote entries ..."
base_url = 'http://api.tumblr.com/v2/blog/mikepope.tumblr.com/posts/quote?api_key=[MY_KEY]'
offset = 0
posts_still_left = True
while posts_still_left:
    # Build the URL fresh on each pass so that offsets don't pile up on the end
    request_url = base_url + "&offset=" + str(offset)
    print "\tFetching quote entries (%i) ..." % offset
    tumblr_response = requests.get(request_url).json()
    total_posts = tumblr_response['response']['total_posts']
    for post in tumblr_response['response']['posts']:
        # See https://www.tumblr.com/docs/en/api/v2#quote-posts
        p = TumblrPost(post['post_url'],
                       post['date'],
                       post['text'],
                       post['source'], '',
                       'quote') # No title for quote posts
        all_posts.append(p)
    offset += 20
    if offset >= total_posts:
        posts_still_left = False

sorted_posts = sorted(all_posts,
                      key=lambda tpost: tpost.post_date,
                      reverse=True)

print "Creating HTML file ..."

# Read a file that contains the HTML layout of the posts,
# with placeholders for individual bits of data
html_file = open(folder_name + 'tumblr_block_template.html', 'r')
html_block_template = html_file.read()
html_file.close()

ctr = 0
for p in sorted_posts:
    new_html_block = html_block_template
    ctr += 1
    new_html_block = new_html_block.replace('%%%posttitle%%%', p.post_title)
    new_html_block = new_html_block.replace('%%%postdate%%%', p.post_date)
    new_html_block = new_html_block.replace('%%%posttext%%%', p.post_text)
    new_html_block = new_html_block.replace('%%%postsource%%%', p.post_source)
    new_html_block = new_html_block.replace('%%%posturl%%%', p.post_url)
    new_html_block = new_html_block.replace('%%%postcounter%%%', str(ctr))
    new_html_block = new_html_block.replace('%%%posttype%%%', p.post_type)
    html_output += new_html_block

# The template has a placeholder for the content that's generated dynamically
html_file = open(folder_name + 'tumblr_template.html', 'r')
html_file_contents = html_file.read()
html_file.close()
html_file_contents = html_file_contents.replace('%%%posts%%%', html_output)

# Open (i.e., create) a new file with the ability to write Unicode.
# See http://stackoverflow.com/questions/934160/write-to-utf-8-file-in-python
file_timestamp = datetime.datetime.now().strftime('%Y-%m-%d-%H-%M-%S')
with codecs.open(folder_name +
                 'tumbl_bu-' +
                 file_timestamp +
                 '.html', 'w', "utf-8-sig") \
        as new_html_file:
    # The with block closes the file, so no explicit close() is needed
    new_html_file.write(html_file_contents)

print 'Done!'
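The two fetch loops in the listing follow the same pattern: request a page, collect its posts, bump the offset by 20, and stop once the offset passes total_posts. Here is a sketch of that pattern pulled out into one function. It's written for Python 3, and fetch_all, fetch_page, and the fake data are all hypothetical names invented for this example; a stand-in function takes the place of the real Tumblr request so that the loop can be run without a network connection or an API key.

```python
def fetch_all(fetch_page, page_size=20):
    """Collect posts page by page until every post has been fetched.

    fetch_page(offset) must return a (posts, total_posts) tuple.
    """
    posts = []
    offset = 0
    while True:
        page, total_posts = fetch_page(offset)
        posts.extend(page)
        offset += page_size
        # Stop when the last page was empty or the offset has passed the total
        if not page or offset >= total_posts:
            break
    return posts

# Hypothetical stand-in for the Tumblr API: 45 fake posts, 20 per request.
fake_posts = ['post %d' % i for i in range(45)]
requested_offsets = []

def fake_fetch_page(offset):
    requested_offsets.append(offset)
    return fake_posts[offset:offset + 20], len(fake_posts)

result = fetch_all(fake_fetch_page)
print(len(result), requested_offsets)
```

In the real script, the function passed in as fetch_page would build the request URL from the base URL plus the current offset, call requests.get, and return the posts list and total_posts value from the response.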
