SamSuka
Touhou-Project.com


The Searchers

Hey all, hope you’ve been doing well. We continue to look at the upcoming story list changes, this time focusing on how searching works.  

At its most basic, searching for something tends not to be too hard in programs. Higher-level languages and frameworks usually come with plenty of tools, filters and other built-in functions for it. It only gets tricky when you have multiple queries going on at the same time. What eats up the time is figuring out how to keep things as quick and efficient as possible without adding too much complexity to the code.

Let’s go from abstract to practical: tags, author name, story title and status are the four things that can affect a search in the new story list. You may recall from my post on tags that the tag names and the tags assigned to stories are kept in separate tables. Meanwhile, the other search inputs are all found in a single table. So that means that if you’re not using tags, that’s a single database query. And if you are, it’ll likely be a lot of queries.

So then a logical starting point in any search is to separate the cases where a user is searching only with tags from those where they’re not. Though the final code checks whether the non-tag inputs are set first, I started by writing the part that dealt with tags as that was the most complex.

Whenever dealing with data entry, it’s good practice to never trust the user. If there’s a chance of somehow manipulating, corrupting or generally inputting things that are unintended, the program should try to block that.

So, when something is passed on to the server from the tag search field on the page, several things happen. First, a list of valid tags is fetched from storage. Then the user's input string is sanitized (illegal characters and HTML entities removed, white space trimmed) and turned into an array of values. That array is then looped over, with each element passing through a series of tests: does the sanitized entry match one in the list of valid tags? If it doesn’t, we send a message to the rest of the code that there’s an invalid tag and thus to return no search results. If it does, this valid tag is added to a list of tags to search for.
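As a rough illustration of the sanitize-and-validate loop described above, here is a minimal sketch. It is not the site's code: the function name `parseTagInput`, the comma separator, the lowercase tag names and the exact set of "legal" characters are all assumptions made for the example.

```php
<?php
// Hypothetical helper: none of these names come from the site's code base.

/**
 * Turn raw tag input into a list of valid tags.
 * Returns null as the "invalid tag, show no results" signal.
 */
function parseTagInput(string $raw, array $validTags): ?array
{
    $found = [];
    // Assumption: tags are typed as a comma-separated list.
    foreach (explode(',', $raw) as $piece) {
        $clean = strip_tags($piece);                             // drop HTML tags
        $clean = preg_replace('/&[#A-Za-z0-9]+;/', '', $clean);  // drop HTML entities
        $clean = preg_replace('/[^A-Za-z0-9 \-]/', '', $clean);  // drop other "illegal" characters
        $clean = strtolower(trim($clean));                       // trim white space, normalise case

        if ($clean === '') {
            continue; // ignore empty pieces such as a trailing comma
        }
        if (!in_array($clean, $validTags, true)) {
            return null; // one unknown tag invalidates the whole search
        }
        $found[] = $clean;
    }
    return array_values(array_unique($found));
}

// Stand-in for the list of valid tags fetched from storage.
$validTags = ['comedy', 'drama', 'slice of life'];

var_dump(parseTagInput(' Comedy, <b>drama</b> ', $validTags)); // the two valid tags
var_dump(parseTagInput('comedy, nonsense', $validTags));       // NULL: invalid tag
```

Returning `null` for "invalid" and an array otherwise keeps the caller's check to a single comparison; a real implementation might prefer an exception or a richer result object.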

Once that is all complete and there are no errors, database lookups happen. For each valid tag, an array of stories that have it assigned is returned. Obviously, if no story matches one of the searched tags, the program once again returns no results. When all that is done, you need to compare the arrays and only take the story ids that are common to all tags.

Now, there’s a built-in function in PHP that does just that but it didn’t work for me. I spent hours trying to figure it out, reading the manual and looking stuff up on the internet. It would always return all of the values in the first input array even when subsequent arrays may not have contained some of them. Best I can tell this may have been related to the callbacks I was doing to functions. Ultimately, I just decided to make my own function using some basic math and loops within loops as I was getting nowhere rewriting surrounding code. See the code snippet below:
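The snippet itself isn't included in this copy of the post, so what follows is only a stand-in: a minimal sketch of the "basic math and loops within loops" idea, not the site's actual function. It counts how many of the per-tag lists each story id appears in and keeps the ids whose count equals the number of lists. The name `commonStoryIds` and the sample ids are made up for illustration.

```php
<?php
// Sketch only: a hand-rolled "ids common to every list" function.
// The function name and the sample data are made up for illustration.

function commonStoryIds(array $idLists): array
{
    if ($idLists === []) {
        return [];
    }

    $counts = [];
    foreach ($idLists as $ids) {
        // array_unique so an id repeated inside one list is only counted once
        foreach (array_unique($ids) as $id) {
            $counts[$id] = ($counts[$id] ?? 0) + 1;
        }
    }

    // An id is common to all tags only if every list contained it.
    $needed = count($idLists);
    $common = [];
    foreach ($counts as $id => $seen) {
        if ($seen === $needed) {
            $common[] = $id;
        }
    }
    return $common;
}

// One list of story ids per searched tag.
$perTag = [
    [1, 2, 3, 5],
    [2, 3, 5, 8],
    [3, 5, 13],
];
print_r(commonStoryIds($perTag)); // ids 3 and 5
```

For comparison, PHP's native `array_intersect()` covers the same job when given the lists as separate arguments, for example `array_values(array_intersect(...$perTag))`. The post doesn't say which built-in was tried.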

I have no idea if this is faster than the native function but, hey, it works and I’m merely a code monkey who can’t do any better. Anyhow, the valid story id results are once again returned from that and are then converted into a variable length statement that goes for the final database lookup: the one that fetches story results, phew!
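One common way to turn a list of ids into a "variable length statement" is an `IN (...)` clause with one placeholder per id. The sketch below assumes that is what's meant here. The table and column names (`stories`, `id`, `title`) are placeholders rather than the real schema, and the PDO calls in the comments are likewise an assumption about how the query would be run.

```php
<?php
// Sketch only: table and column names are placeholders, not the real schema.

/**
 * Build a prepared-statement string with one "?" per story id.
 * Returns [sql, params]; sql is null when there is nothing to look up.
 */
function buildStoryLookup(array $storyIds): array
{
    // Force integers and drop duplicates before building the statement.
    $ids = array_values(array_unique(array_map('intval', $storyIds)));
    if ($ids === []) {
        return [null, []];
    }

    $placeholders = implode(', ', array_fill(0, count($ids), '?'));
    $sql = "SELECT id, title FROM stories WHERE id IN ($placeholders)";
    return [$sql, $ids];
}

[$sql, $params] = buildStoryLookup([3, 5, 8]);
echo $sql, PHP_EOL;

// With PDO (an assumption), this would then be run along the lines of:
//   $stmt = $pdo->prepare($sql);
//   $stmt->execute($params);
```

Because only `?` placeholders are interpolated into the SQL string, the ids themselves never become part of the statement text.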

Maybe you now understand why I had to start with the story tags. By comparison, the checks for the other search fields are more straightforward. You check if there’s anything in the input field, sanitize it, and add it to a variable that is part of a prepared search statement. There are a few small sanity checks and caveats here and there to make sure nothing funny happens but then that’s all run as a straightforward database search of a single table that fetches storyid(s).
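A minimal sketch of that "check, sanitize, add to a prepared statement" flow might look like the following. Everything specific in it is an assumption: the function name, the `stories` table, the `author`, `title` and `status` columns, the partial matching with `LIKE`, and the list of allowed status values.

```php
<?php
// Sketch only: table, columns and status values are invented for the example.

/**
 * Build a single-table search from the non-tag inputs.
 * Returns [sql, params], or null when there is nothing valid to run.
 */
function buildFieldSearch(array $input): ?array
{
    $where  = [];
    $params = [];

    foreach (['author', 'title'] as $field) {
        $value = trim(strip_tags((string) ($input[$field] ?? '')));
        if ($value !== '') {
            $where[]  = "$field LIKE ?";
            // Note: % and _ typed by the user are not escaped in this sketch.
            $params[] = '%' . $value . '%';
        }
    }

    // Sanity check: status must be one of a known set of values.
    $allowedStatuses = ['ongoing', 'complete', 'abandoned'];
    $status = trim((string) ($input['status'] ?? ''));
    if ($status !== '') {
        if (!in_array($status, $allowedStatuses, true)) {
            return null;
        }
        $where[]  = 'status = ?';
        $params[] = $status;
    }

    if ($where === []) {
        return null; // no non-tag input was set
    }

    $sql = 'SELECT id FROM stories WHERE ' . implode(' AND ', $where);
    return [$sql, $params];
}

[$sql, $params] = buildFieldSearch(['author' => ' Sam ', 'status' => 'ongoing']);
echo $sql, PHP_EOL;
print_r($params);
```

Only fixed column names and `?` placeholders end up in the SQL text; the user's values travel separately as parameters.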

If there are any matches, we then loop through them and check if any tags were also input. It is here that we reuse many of the previously-written functions and save ourselves a lot of work. Remember the valid tag checking bit? Well, I added to that function a parameter that, if set, checks that valid tag once more during its loop—doing a quick database check if that particular story (whose id has been passed into the function) has that valid tag assigned to it. If it doesn’t, we skip over that story result entirely in the array we’re building for that final lookup at the end.  
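The per-story tag check can be sketched as below. The post describes it as an extra parameter on the existing tag-validation function; for brevity this sketch pulls it out into a separate hypothetical function, `filterStoriesByTags`, and uses a callback with an in-memory array in place of the quick database check.

```php
<?php
// Sketch only: $hasTag stands in for the quick database check, and the
// function name and sample data are invented for the example.

function filterStoriesByTags(array $storyIds, array $tags, callable $hasTag): array
{
    $kept = [];
    foreach ($storyIds as $storyId) {
        foreach ($tags as $tag) {
            if (!$hasTag($storyId, $tag)) {
                continue 2; // missing one tag: skip this story entirely
            }
        }
        $kept[] = $storyId;
    }
    return $kept;
}

// Stand-in for the table of tags assigned to stories.
$assigned = [
    1 => ['comedy'],
    2 => ['comedy', 'drama'],
    3 => ['drama'],
];
$hasTag = fn(int $id, string $tag): bool => in_array($tag, $assigned[$id] ?? [], true);

print_r(filterStoriesByTags([1, 2, 3], ['comedy', 'drama'], $hasTag)); // only story 2
```

Passing the check in as a callable keeps the loop testable without a database. In the real code the same role would be played by a query against the story-tag table.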

Once that’s done, it proceeds just as in the tags-only case, running that final lookup to return all the valid story matches. Those matches go through the same template file as always, just outputting the stories that are needed.

So yeah, I know that we were a bit more technical than usual so don’t worry if you don’t really follow the jargon. The bottom line is that even something that’s simple conceptually may well prove to be complex and require a lot of tinkering to get working just right. Things like these are why I’ve put off larger overhauls such as the story list until I had enough familiarity with the code base and faith in my programming skills that I could design something sane and write decent, maintainable code.

Be well, guys. Until next time, take it easy!

