f2fs: updates on 4.15-rc1
Pull f2fs updates from Jaegeuk Kim:
 "In this round, we introduce sysfile-based quota support, which is
  required for Android by default. In addition, we allow users to
  reserve some blocks at runtime to mitigate performance drops when
  free space is low.

 Enhancements:
  - assign proper data segments according to write_hints given by user
  - issue cache_flush on dirty devices only among multiple devices
  - exploit cp_error flag and add more faults to enhance fault
    injection test
  - conduct more readaheads during f2fs_readdir
  - add a range for discard commands

 Bug fixes:
  - fix zero stat->st_blocks when inline_data is set
  - drop crypto key and free stale memory pointer while evict_inode is
    failing
  - fix some corner cases in free space and segment management
  - fix wrong last_disk_size

 This series includes lots of clean-ups and code enhancements in terms
 of xattr operations and discard/flush command control. In addition, it
 adds versatile debugfs entries to monitor f2fs status."

Cherry-picked from origin/upstream-f2fs-stable-linux-4.4.y:

56a07b0 f2fs: deny accessing encryption policy if encryption is off
c394842 f2fs: inject fault in inc_valid_node_count
9262922 f2fs: fix to clear FI_NO_PREALLOC
e6cfc5d f2fs: expose quota information in debugfs
c4cd2ef f2fs: separate nat entry mem alloc from nat_tree_lock
48c72b4 f2fs: validate before set/clear free nat bitmap
baf9275 f2fs: avoid opened loop codes in __add_ino_entry
47af6c7 f2fs: apply write hints to select the type of segments for buffered write
ac98191 f2fs: introduce scan_curseg_cache for cleanup
ca28e96 f2fs: optimize the way of traversing free_nid_bitmap
460688b f2fs: keep scanning until enough free nids are acquired
0186182 f2fs: trace checkpoint reason in fsync()
5d4b6ef f2fs: keep isize once block is reserved cross EOF
3c8f767 f2fs: avoid race in between GC and block exchange
4423778 f2fs: save a multiplication for last_nid calculation
3e3b405 f2fs: fix summary info corruption
44889e4 f2fs: remove dead code in update_meta_page
55c7b95 f2fs: remove unneeded semicolon
8b92814 f2fs: don't bother with inode->i_version
42c7c71 f2fs: check curseg space before foreground GC
c547049 f2fs: use rw_semaphore to protect SIT cache
82750d3 f2fs: support quota sys files
26dfec4 f2fs: add quota_ino feature infra
ddb8e2a f2fs: optimize __update_nat_bits
f46ae95 f2fs: modify for accurate fggc node io stat
c713fdb Revert "f2fs: handle dirty segments inside refresh_sit_entry"
873ec50 f2fs: add a function to move nid
ae66786 f2fs: export SSR allocation threshold
90c28a1 f2fs: give correct trimmed blocks in fstrim
5612922 f2fs: support bio allocation error injection
583b7a2 f2fs: support get_page error injection
09a073c f2fs: add missing sysfs description
e945474 f2fs: support soft block reservation
b7b2e62 f2fs: handle error case when adding xattr entry
7368e30 f2fs: support flexible inline xattr size
ada4061 f2fs: show current cp state
5b8ff13 f2fs: add missing quota_initialize
46d4a69 f2fs: show # of dirty segments via sysfs
fc13f9d f2fs: stop all the operations by cp_error flag
91bea0c f2fs: remove several redundant assignments
807486c f2fs: avoid using timespec
03b1cb0 f2fs: fix to correct no_fggc_candidate
5c15033 Revert "f2fs: return wrong error number on f2fs_quota_write"
5f5f593 f2fs: remove obsolete pointer for truncate_xattr_node
032a690 f2fs: retry ENOMEM for quota_read|write
171b638 f2fs: limit # of inmemory pages
83ed7a6 f2fs: update ctx->pos correctly when hitting hole in directory
4d6e68b f2fs: relocate readahead codes in readdir()
c8be47b f2fs: allow readdir() to be interrupted
2b903fe f2fs: trace f2fs_readdir
bb0db66 f2fs: trace f2fs_lookup
40d6250 f2fs: skip searching non-exist range in truncate_hole
8e84f37 f2fs: expose some sectors to user in inline data or dentry case
cb98f70 f2fs: avoid stale fi->gdirty_list pointer
5562a3c f2fs/crypto: drop crypto key at evict_inode only
85853e7 f2fs: fix to avoid race when accessing last_disk_size
0c47a89 f2fs: Fix bool initialization/comparison
68e801a f2fs: give up CP_TRIMMED_FLAG if it drops discards
df74eac f2fs: trace f2fs_remove_discard
bd502c6 f2fs: reduce cmd_lock coverage in __issue_discard_cmd
a34ab5c f2fs: split discard policy
1e65afd f2fs: wrap discard policy
684447d f2fs: support issuing/waiting discard in range
27eaad0 f2fs: fix to flush multiple device in checkpoint
08bb9d6 f2fs: enhance multiple device flush
9c2526a f2fs: fix to show ino management cache size correctly
814b463 f2fs: drop FI_UPDATE_WRITE tag after f2fs_issue_flush
f555b0a f2fs: obsolete ALLOC_NID_LIST list
75d3164 f2fs: convert inline data for direct I/O & FI_NO_PREALLOC
4de0ceb f2fs: allow readpages with NULL file pointer
322a45d f2fs: show flush list status in sysfs
6d625a9 f2fs: introduce read_xattr_block
8ea6e1c f2fs: introduce read_inline_xattr
dbce11e Revert "f2fs: reuse nids more aggressively"
131bc9f Revert "f2fs: node segment is prior to data segment selected victim"

Change-Id: I93b9cd867b859a667a448b39299ff44a2b841b8c
Signed-off-by: Jaegeuk Kim <[email protected]>
Jaegeuk Kim authored and pundiramit committed Jan 22, 2018
1 parent 2162299 commit 6522a6d
Showing 23 changed files with 1,655 additions and 623 deletions.
43 changes: 42 additions & 1 deletion Documentation/ABI/testing/sysfs-fs-f2fs
@@ -51,6 +51,18 @@ Description:
Controls the dirty page count condition for the in-place-update
policies.

What: /sys/fs/f2fs/<disk>/min_hot_blocks
Date: March 2017
Contact: "Jaegeuk Kim" <[email protected]>
Description:
Controls the dirty page count condition for redefining hot data.

What: /sys/fs/f2fs/<disk>/min_ssr_sections
Date: October 2017
Contact: "Chao Yu" <[email protected]>
Description:
Controls the free section threshold to trigger SSR allocation.

What: /sys/fs/f2fs/<disk>/max_small_discards
Date: November 2013
Contact: "Jaegeuk Kim" <[email protected]>
@@ -96,6 +108,18 @@ Contact: "Jaegeuk Kim" <[email protected]>
Description:
Controls the checkpoint timing.

What: /sys/fs/f2fs/<disk>/idle_interval
Date: January 2016
Contact: "Jaegeuk Kim" <[email protected]>
Description:
Controls the idle timing.

What: /sys/fs/f2fs/<disk>/iostat_enable
Date: August 2017
Contact: "Chao Yu" <[email protected]>
Description:
Controls to enable/disable IO stat.

What: /sys/fs/f2fs/<disk>/ra_nid_pages
Date: October 2015
Contact: "Chao Yu" <[email protected]>
@@ -116,6 +140,12 @@ Contact: "Shuoran Liu" <[email protected]>
Description:
Shows total written kbytes issued to disk.

What: /sys/fs/f2fs/<disk>/feature
Date: July 2017
Contact: "Jaegeuk Kim" <[email protected]>
Description:
Shows all enabled features in current device.
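
The entries documented in this file all live under /sys/fs/f2fs/<disk>/. As a minimal userspace sketch of how such an attribute could be located and read, assuming plain text files with one value per line: the helper names `sketch_f2fs_attr_path` and `sketch_read_attr` are hypothetical and are not part of this commit or of any f2fs tooling.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build "/sys/fs/f2fs/<disk>/<attr>" into buf.
 * Returns 0 on success, -1 if the path did not fit.
 */
static int sketch_f2fs_attr_path(char *buf, size_t len,
				 const char *disk, const char *attr)
{
	int n;

	assert(buf != 0 && disk != 0 && attr != 0);
	n = snprintf(buf, len, "/sys/fs/f2fs/%s/%s", disk, attr);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: read the first line of the file at path into out
 * and strip the trailing newline. Returns 0 on success, -1 on failure.
 */
static int sketch_read_attr(const char *path, char *out, size_t len)
{
	FILE *fp;

	assert(path != 0 && out != 0 && len > 0);
	fp = fopen(path, "r");
	if (!fp)
		return -1;
	if (!fgets(out, (int)len, fp)) {
		fclose(fp);
		return -1;
	}
	fclose(fp);
	out[strcspn(out, "\n")] = '\0';
	return 0;
}
```

A caller would build the path for a disk name such as `sda1` (an illustrative value) and an attribute such as `reserved_blocks`, then pass it to `sketch_read_attr`.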

What: /sys/fs/f2fs/<disk>/inject_rate
Date: May 2016
Contact: "Sheng Yong" <[email protected]>
@@ -132,7 +162,18 @@ What: /sys/fs/f2fs/<disk>/reserved_blocks
Date: June 2017
Contact: "Chao Yu" <[email protected]>
Description:
Controls current reserved blocks in system.
Controls target reserved blocks in system, the threshold
is soft, it could exceed current available user space.

What: /sys/fs/f2fs/<disk>/current_reserved_blocks
Date: October 2017
Contact: "Yunlong Song" <[email protected]>
Contact: "Chao Yu" <[email protected]>
Description:
Shows current reserved blocks in system, it may be temporarily
smaller than target_reserved_blocks, but will gradually
increase to target_reserved_blocks when more free blocks are
freed by user later.
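
The relationship between the target and the current reservation described in these two entries can be modelled roughly as follows. This is a userspace sketch under stated assumptions, not the kernel implementation: the struct `sketch_resv` and both functions are hypothetical, and the model only captures the documented behaviour that the current value may lag the target and catches up as blocks are freed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the soft reservation; not kernel code. */
struct sketch_resv {
	uint64_t free_blocks;      /* blocks currently unused */
	uint64_t target_reserved;  /* value written to reserved_blocks */
	uint64_t current_reserved; /* value shown by current_reserved_blocks */
};

static uint64_t sketch_min(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * The threshold is soft: a target above the free space is accepted,
 * and the current reservation is clamped to what is actually free.
 */
static void sketch_set_target(struct sketch_resv *r, uint64_t target)
{
	assert(r != 0);
	r->target_reserved = target;
	r->current_reserved = sketch_min(target, r->free_blocks);
}

/* As the user frees blocks, the reservation grows toward the target. */
static void sketch_blocks_freed(struct sketch_resv *r, uint64_t count)
{
	assert(r != 0);
	r->free_blocks += count;
	r->current_reserved = sketch_min(r->target_reserved, r->free_blocks);
}
```

With 100 free blocks and a target of 300, this model reports a current reservation of 100, which rises to 150 after 50 more blocks are freed and stops at 300 once enough space is available.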

What: /sys/fs/f2fs/<disk>/gc_urgent
Date: August 2017
3 changes: 3 additions & 0 deletions fs/f2fs/acl.c
@@ -253,6 +253,9 @@ static int __f2fs_set_acl(struct inode *inode, int type,

int f2fs_set_acl(struct inode *inode, struct posix_acl *acl, int type)
{
if (unlikely(f2fs_cp_error(F2FS_I_SB(inode))))
return -EIO;

return __f2fs_set_acl(inode, type, acl, NULL);
}
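
The hunk above gates an entry point on the checkpoint error flag, in line with the "stop all the operations by cp_error flag" commit listed in the message. A minimal userspace sketch of that pattern follows. All names here (`sketch_sb`, `sketch_stop_checkpoint`, `sketch_set_acl`) are hypothetical stand-ins, and a C11 atomic flag is an assumption made for the sketch rather than how the kernel stores its checkpoint flags.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-filesystem state. */
struct sketch_sb {
	atomic_bool cp_error; /* set once a checkpoint has failed */
};

/* Record the error; later operations observe it and bail out. */
static void sketch_stop_checkpoint(struct sketch_sb *sb)
{
	assert(sb != 0);
	atomic_store(&sb->cp_error, true);
}

static bool sketch_cp_error(struct sketch_sb *sb)
{
	assert(sb != 0);
	return atomic_load(&sb->cp_error);
}

/*
 * An operation entry point: refuse to modify state with -EIO once the
 * error flag is set, otherwise apply the change.
 */
static int sketch_set_acl(struct sketch_sb *sb, int *acl_slot, int new_acl)
{
	assert(acl_slot != 0);
	if (sketch_cp_error(sb))
		return -EIO;
	*acl_slot = new_acl;
	return 0;
}
```

Checking the flag at the top of each entry point keeps the failure mode uniform: callers see -EIO and no state is modified after the error has been recorded.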

64 changes: 49 additions & 15 deletions fs/f2fs/checkpoint.c
@@ -29,7 +29,6 @@ struct kmem_cache *inode_entry_slab;
void f2fs_stop_checkpoint(struct f2fs_sb_info *sbi, bool end_io)
{
set_ckpt_flags(sbi, CP_ERROR_FLAG);
sbi->sb->s_flags |= MS_RDONLY;
if (!end_io)
f2fs_flush_merged_writes(sbi);
}
@@ -402,31 +401,34 @@ const struct address_space_operations f2fs_meta_aops = {
#endif
};

static void __add_ino_entry(struct f2fs_sb_info *sbi, nid_t ino, int type)
static void __add_ino_entry(struct f2fs_sb_info *sbi, nid_t ino,
unsigned int devidx, int type)
{
struct inode_management *im = &sbi->im[type];
struct ino_entry *e, *tmp;

tmp = f2fs_kmem_cache_alloc(ino_entry_slab, GFP_NOFS);
retry:

radix_tree_preload(GFP_NOFS | __GFP_NOFAIL);

spin_lock(&im->ino_lock);
e = radix_tree_lookup(&im->ino_root, ino);
if (!e) {
e = tmp;
if (radix_tree_insert(&im->ino_root, ino, e)) {
spin_unlock(&im->ino_lock);
radix_tree_preload_end();
goto retry;
}
if (unlikely(radix_tree_insert(&im->ino_root, ino, e)))
f2fs_bug_on(sbi, 1);

memset(e, 0, sizeof(struct ino_entry));
e->ino = ino;

list_add_tail(&e->list, &im->ino_list);
if (type != ORPHAN_INO)
im->ino_num++;
}

if (type == FLUSH_INO)
f2fs_set_bit(devidx, (char *)&e->dirty_device);

spin_unlock(&im->ino_lock);
radix_tree_preload_end();

@@ -455,7 +457,7 @@ static void __remove_ino_entry(struct f2fs_sb_info *sbi, nid_t ino, int type)
void add_ino_entry(struct f2fs_sb_info *sbi, nid_t ino, int type)
{
/* add new dirty ino entry into list */
__add_ino_entry(sbi, ino, type);
__add_ino_entry(sbi, ino, 0, type);
}

void remove_ino_entry(struct f2fs_sb_info *sbi, nid_t ino, int type)
@@ -481,7 +483,7 @@ void release_ino_entry(struct f2fs_sb_info *sbi, bool all)
struct ino_entry *e, *tmp;
int i;

for (i = all ? ORPHAN_INO: APPEND_INO; i <= UPDATE_INO; i++) {
for (i = all ? ORPHAN_INO : APPEND_INO; i < MAX_INO_ENTRY; i++) {
struct inode_management *im = &sbi->im[i];

spin_lock(&im->ino_lock);
@@ -495,6 +497,27 @@ void release_ino_entry(struct f2fs_sb_info *sbi, bool all)
}
}

void set_dirty_device(struct f2fs_sb_info *sbi, nid_t ino,
unsigned int devidx, int type)
{
__add_ino_entry(sbi, ino, devidx, type);
}

bool is_dirty_device(struct f2fs_sb_info *sbi, nid_t ino,
unsigned int devidx, int type)
{
struct inode_management *im = &sbi->im[type];
struct ino_entry *e;
bool is_dirty = false;

spin_lock(&im->ino_lock);
e = radix_tree_lookup(&im->ino_root, ino);
if (e && f2fs_test_bit(devidx, (char *)&e->dirty_device))
is_dirty = true;
spin_unlock(&im->ino_lock);
return is_dirty;
}
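
The two helpers above record, per inode, which devices hold dirty data as bits in `dirty_device`, so that a later flush can skip clean devices. A minimal userspace sketch of that bookkeeping follows. The `sketch_` names and the 8-device limit are hypothetical, the bit helpers are stand-ins for `f2fs_set_bit`/`f2fs_test_bit`, and the MSB-first bit numbering within each byte is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_DEVICES 8 /* hypothetical limit for the sketch */

/* Hypothetical stand-in for struct ino_entry. */
struct sketch_ino_entry {
	uint32_t ino;
	unsigned char dirty_device[(SKETCH_MAX_DEVICES + 7) / 8];
};

/* Set bit nr, numbering bits MSB-first within each byte (assumption). */
static void sketch_set_bit(unsigned int nr, unsigned char *addr)
{
	addr[nr >> 3] |= (unsigned char)(1u << (7 - (nr & 7)));
}

static bool sketch_test_bit(unsigned int nr, const unsigned char *addr)
{
	return (addr[nr >> 3] & (1u << (7 - (nr & 7)))) != 0;
}

/* Mark device devidx as holding dirty data for this inode. */
static void sketch_mark_dirty(struct sketch_ino_entry *e, unsigned int devidx)
{
	assert(devidx < SKETCH_MAX_DEVICES);
	sketch_set_bit(devidx, e->dirty_device);
}

static bool sketch_is_dirty(const struct sketch_ino_entry *e,
			    unsigned int devidx)
{
	assert(devidx < SKETCH_MAX_DEVICES);
	return sketch_test_bit(devidx, e->dirty_device);
}

/* Count how many of the first ndevs devices would need a cache flush. */
static unsigned int sketch_devices_to_flush(const struct sketch_ino_entry *e,
					    unsigned int ndevs)
{
	unsigned int i, n = 0;

	for (i = 0; i < ndevs; i++)
		if (sketch_is_dirty(e, i))
			n++;
	return n;
}
```

In the kernel code the lookup and the bit test happen under `im->ino_lock`; the sketch omits locking to keep the bitmap logic visible.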

int acquire_orphan_inode(struct f2fs_sb_info *sbi)
{
struct inode_management *im = &sbi->im[ORPHAN_INO];
@@ -531,7 +554,7 @@ void release_orphan_inode(struct f2fs_sb_info *sbi)
void add_orphan_inode(struct inode *inode)
{
/* add new orphan ino entry into list */
__add_ino_entry(F2FS_I_SB(inode), inode->i_ino, ORPHAN_INO);
__add_ino_entry(F2FS_I_SB(inode), inode->i_ino, 0, ORPHAN_INO);
update_inode_page(inode);
}

@@ -555,7 +578,7 @@ static int recover_orphan_inode(struct f2fs_sb_info *sbi, nid_t ino)
return err;
}

__add_ino_entry(sbi, ino, ORPHAN_INO);
__add_ino_entry(sbi, ino, 0, ORPHAN_INO);

inode = f2fs_iget_retry(sbi->sb, ino);
if (IS_ERR(inode)) {
@@ -591,6 +614,9 @@ int recover_orphan_inodes(struct f2fs_sb_info *sbi)
block_t start_blk, orphan_blocks, i, j;
unsigned int s_flags = sbi->sb->s_flags;
int err = 0;
#ifdef CONFIG_QUOTA
int quota_enabled;
#endif

if (!is_set_ckpt_flags(sbi, CP_ORPHAN_PRESENT_FLAG))
return 0;
@@ -603,8 +629,9 @@ int recover_orphan_inodes(struct f2fs_sb_info *sbi)
#ifdef CONFIG_QUOTA
/* Needed for iput() to work correctly and not trash data */
sbi->sb->s_flags |= MS_ACTIVE;

/* Turn on quotas so that they are updated correctly */
f2fs_enable_quota_files(sbi);
quota_enabled = f2fs_enable_quota_files(sbi, s_flags & MS_RDONLY);
#endif

start_blk = __start_cp_addr(sbi) + 1 + __cp_payload(sbi);
@@ -632,7 +659,8 @@ int recover_orphan_inodes(struct f2fs_sb_info *sbi)
out:
#ifdef CONFIG_QUOTA
/* Turn quotas off */
f2fs_quota_off_umount(sbi->sb);
if (quota_enabled)
f2fs_quota_off_umount(sbi->sb);
#endif
sbi->sb->s_flags = s_flags; /* Restore MS_RDONLY status */

@@ -987,7 +1015,7 @@ int f2fs_sync_inode_meta(struct f2fs_sb_info *sbi)
update_inode_page(inode);
iput(inode);
}
};
}
return 0;
}

@@ -1147,6 +1175,7 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
struct super_block *sb = sbi->sb;
struct curseg_info *seg_i = CURSEG_I(sbi, CURSEG_HOT_NODE);
u64 kbytes_written;
int err;

/* Flush all the NAT/SIT pages */
while (get_pages(sbi, F2FS_DIRTY_META)) {
@@ -1240,6 +1269,11 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
if (unlikely(f2fs_cp_error(sbi)))
return -EIO;

/* flush all device cache */
err = f2fs_flush_device_cache(sbi);
if (err)
return err;

/* write out checkpoint buffer at block 0 */
update_meta_page(sbi, ckpt, start_blk++);

38 changes: 27 additions & 11 deletions fs/f2fs/data.c
@@ -172,7 +172,7 @@ static struct bio *__bio_alloc(struct f2fs_sb_info *sbi, block_t blk_addr,
{
struct bio *bio;

bio = f2fs_bio_alloc(npages);
bio = f2fs_bio_alloc(sbi, npages, true);

f2fs_target_device(sbi, blk_addr, bio);
bio->bi_end_io = is_read ? f2fs_read_end_io : f2fs_write_end_io;
@@ -417,8 +417,8 @@ int f2fs_submit_page_write(struct f2fs_io_info *fio)

bio_page = fio->encrypted_page ? fio->encrypted_page : fio->page;

/* set submitted = 1 as a return value */
fio->submitted = 1;
/* set submitted = true as a return value */
fio->submitted = true;

inc_page_count(sbi, WB_DATA_TYPE(bio_page));

@@ -472,7 +472,7 @@ static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr,
f2fs_wait_on_block_writeback(sbi, blkaddr);
}

bio = bio_alloc(GFP_KERNEL, min_t(int, nr_pages, BIO_MAX_PAGES));
bio = f2fs_bio_alloc(sbi, min_t(int, nr_pages, BIO_MAX_PAGES), false);
if (!bio) {
if (ctx)
fscrypt_release_ctx(ctx);
@@ -832,6 +832,13 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
struct f2fs_map_blocks map;
int err = 0;

/* convert inline data for Direct I/O*/
if (iocb->ki_flags & IOCB_DIRECT) {
err = f2fs_convert_inline_inode(inode);
if (err)
return err;
}

if (is_inode_flag_set(inode, FI_NO_PREALLOC))
return 0;

@@ -844,15 +851,11 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)

map.m_next_pgofs = NULL;

if (iocb->ki_flags & IOCB_DIRECT) {
err = f2fs_convert_inline_inode(inode);
if (err)
return err;
if (iocb->ki_flags & IOCB_DIRECT)
return f2fs_map_blocks(inode, &map, 1,
__force_buffered_io(inode, WRITE) ?
F2FS_GET_BLOCK_PRE_AIO :
F2FS_GET_BLOCK_PRE_DIO);
}
if (iocb->ki_pos + iov_iter_count(from) > MAX_INLINE_DATA(inode)) {
err = f2fs_convert_inline_inode(inode);
if (err)
@@ -1332,7 +1335,7 @@ static int f2fs_read_data_pages(struct file *file,
struct address_space *mapping,
struct list_head *pages, unsigned nr_pages)
{
struct inode *inode = file->f_mapping->host;
struct inode *inode = mapping->host;
struct page *page = list_last_entry(pages, struct page, lru);

trace_f2fs_readpages(inode, page, nr_pages);
@@ -1493,6 +1496,7 @@ static int __write_data_page(struct page *page, bool *submitted,
int err = 0;
struct f2fs_io_info fio = {
.sbi = sbi,
.ino = inode->i_ino,
.type = DATA,
.op = REQ_OP_WRITE,
.op_flags = wbc_to_write_flags(wbc),
@@ -1564,8 +1568,11 @@ static int __write_data_page(struct page *page, bool *submitted,
err = do_write_data_page(&fio);
}
}

down_write(&F2FS_I(inode)->i_sem);
if (F2FS_I(inode)->last_disk_size < psize)
F2FS_I(inode)->last_disk_size = psize;
up_write(&F2FS_I(inode)->i_sem);

done:
if (err && err != -ENOENT)
@@ -1945,6 +1952,12 @@ static int f2fs_write_begin(struct file *file, struct address_space *mapping,
}
trace_f2fs_write_begin(inode, pos, len, flags);

if (f2fs_is_atomic_file(inode) &&
!available_free_memory(sbi, INMEM_PAGES)) {
err = -ENOMEM;
goto fail;
}

/*
* We should check this at this moment to avoid deadlock on inode page
* and #0 page. The locking rule for inline_data conversion should be:
@@ -1960,7 +1973,8 @@ static int f2fs_write_begin(struct file *file, struct address_space *mapping,
* Do not use grab_cache_page_write_begin() to avoid deadlock due to
* wait_for_stable_page. Will wait that below with our IO control.
*/
page = grab_cache_page(mapping, index);
page = f2fs_pagecache_get_page(mapping, index,
FGP_LOCK | FGP_WRITE | FGP_CREAT, GFP_NOFS);
if (!page) {
err = -ENOMEM;
goto fail;
@@ -2021,6 +2035,8 @@ static int f2fs_write_begin(struct file *file, struct address_space *mapping,
fail:
f2fs_put_page(page, 1);
f2fs_write_failed(mapping, pos + len);
if (f2fs_is_atomic_file(inode))
drop_inmem_pages_all(sbi);
return err;
}
